Andreas Zeller, a professor of software engineering at Saarland University and a researcher at CISPA, is working to uncover security vulnerabilities before cybercriminals can exploit them. "Modern test generators can generate inputs for the program in question at high speed," Zeller explains. "But for this to work, it's essential to know how the input must be structured, so that the program doesn't reject the inputs as invalid. This is precisely what our researchers are working on: deciphering exactly how the inputs for these programs need to be constructed."
By observing a given program and its inputs, Zeller and his doctoral students Matthias Hoeschele and Alexander Kampmann can automatically extract what is known as a "context-free grammar." This is a description of all valid inputs for a specific program, just as German grammar is a description of correct sentences in the German language. The CISPA researchers named the software system they developed after this core approach: the prototype is called "Autogram," a combination of "automatic" and "grammar." The first results were presented in September 2016 at the Automated Software Engineering conference in Singapore.
"With the grammar that Autogram generates, we can produce millions of valid inputs in minutes, allowing us to test a program more thoroughly," explains Zeller. The large number of inputs significantly reduces the likelihood of overlooking security vulnerabilities, according to Zeller.
To extract the grammar from a specific program, Autogram observes how the program handles a given input. Different parts of the input are processed in different parts of the program, which allows Autogram to gather the information it needs: data about the structure of valid inputs and their relationship to the program's code. The extracted grammars are easy for humans to read because they use identifiers taken from the program's code. "Currently, we're testing our prototype by having it analyze a wide range of input formats, such as JSON or tabular data. We're using around a thousand valid inputs as a base," says Alexander Kampmann. Eventually these sample inputs are to become unnecessary, so that in a later step the grammar can be extracted directly from the program.
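The idea of relating input fragments to the code that processes them can be illustrated with a deliberately tiny sketch. This is not how Autogram is implemented; the toy `key=value;key=value` format, the parser functions, and the `traced` and `observe` helpers are all hypothetical and exist only for this illustration. Each parser function records the fragment it receives, and the recorded fragments are then grouped under the function's name, which hints at why rules derived this way carry names from the program's code.

```python
from collections import defaultdict
from functools import wraps

# (function name, input fragment) pairs recorded during one run.
trace = []

def traced(fn):
    """Record which fragment of the input each parser function receives."""
    @wraps(fn)
    def wrapper(text):
        trace.append((fn.__name__, text))
        return fn(text)
    return wrapper

# Hypothetical program under observation: a parser for "key=value;key=value".
@traced
def parse_key(text):
    return text

@traced
def parse_value(text):
    return int(text)

@traced
def parse_pair(text):
    key, value = text.split("=", 1)
    return parse_key(key), parse_value(value)

@traced
def parse_config(text):
    return dict(parse_pair(pair) for pair in text.split(";"))

def observe(text):
    """Run the parser on one input and group the fragments by function name."""
    trace.clear()
    parse_config(text)
    fragments = defaultdict(list)
    for name, fragment in trace:
        fragments[name].append(fragment)
    return dict(fragments)

if __name__ == "__main__":
    # Print one rule-like line per function, listing the fragments it handled.
    for name, fragments in observe("port=80;retries=3").items():
        print(f"<{name}> ::= " + " | ".join(repr(f) for f in fragments))
```

The output only lists observed examples per function. Generalizing such observations into a real grammar is the hard part that this sketch leaves out.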
Based on the extracted grammar, the researchers can create new test inputs that systematically exercise the program. How to do this efficiently is the subject of their "tribble" project. "Tribble" uses the grammars provided by Autogram and then systematically compiles all valid input variables and code snippets.
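To make grammar-based input generation concrete, here is a minimal sketch in Python. It is neither Autogram nor tribble: the hand-written `GRAMMAR` for a small JSON subset and the `generate` function are assumptions made for this illustration, and the random expansion shown here is simpler than any systematic strategy. Each rule lists a non-recursive alternative first, so generation can be forced to terminate once a depth limit is reached.

```python
import json
import random

# Hypothetical hand-written grammar for a small JSON subset.
# Keys are nonterminals; each alternative is a list of symbols.
# The first alternative of every rule leads to termination.
GRAMMAR = {
    "<value>": [["null"], ["true"], ["false"], ["<number>"],
                ["<string>"], ["<array>"], ["<object>"]],
    "<object>": [["{", "}"], ["{", "<members>", "}"]],
    "<members>": [["<pair>"], ["<pair>", ",", "<members>"]],
    "<pair>": [["<string>", ":", "<value>"]],
    "<array>": [["[", "]"], ["[", "<elements>", "]"]],
    "<elements>": [["<value>"], ["<value>", ",", "<elements>"]],
    "<string>": [['"', "<chars>", '"']],
    "<chars>": [[], ["<char>", "<chars>"]],
    "<char>": [["a"], ["b"], ["c"]],
    "<number>": [["0"], ["<nonzero>", "<digits>"]],
    "<digits>": [[], ["<digit>", "<digits>"]],
    "<nonzero>": [[d] for d in "123456789"],
    "<digit>": [[d] for d in "0123456789"],
}

def generate(symbol="<value>", depth=0, max_depth=8, rng=random):
    """Expand a symbol into a string by picking grammar alternatives at random."""
    if symbol not in GRAMMAR:
        return symbol  # terminal: emit as-is
    alternatives = GRAMMAR[symbol]
    # Past the depth limit, always take the first (terminating) alternative.
    if depth >= max_depth:
        expansion = alternatives[0]
    else:
        expansion = rng.choice(alternatives)
    return "".join(generate(s, depth + 1, max_depth, rng) for s in expansion)

if __name__ == "__main__":
    rng = random.Random(42)
    for _ in range(5):
        sample = generate(rng=rng)
        json.loads(sample)  # raises ValueError if the input were invalid
        print(sample)
```

Because every generated string follows the grammar, a JSON parser accepts all of them, so each test input reaches the program's processing logic instead of failing early in input validation.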
Zeller's team of IT security researchers already has extensive experience with grammar-based testing. In 2012, they released their test generator LANGFUZZ, which thoroughly tested the Firefox web browser using a grammar that was still written by hand. Firefox developers have used LANGFUZZ daily for four years, and with its help more than 4,000 bugs and security vulnerabilities have been identified and fixed so far.
The researchers in Saarbrücken are now extending the approach from Firefox to virtually any program and input format. "The long-term goal is fully automated security testing, applicable to everything from the smallest Internet of Things gadget to entire servers," says Zeller.
