Firefly is a tool developed by Runtime Verification, Inc., powered by the KEVM semantics in K.
Running Firefly on your contracts reduces the work needed for formal verification, which means saved time. Consider setting up Firefly on your repository before contacting us for more in depth formal verification services. A definite must for any smart contract developer planning to store and manipulate user funds.
The tools in this list will make use of the concrete execution reference interpreter derived from the semantics. They require no changes to the user’s existing test-suite, which makes beginning to use Firefly quite simple.
Test runner. Firefly contains a reference implementation of the EVM equipped with a Web3 client so that it can be integrated into standard testing platforms for Ethereum. Firefly is a drop-in replacement for ganache-cli (even obeying the same command-line arguments), so users only need to replace their call to ganache-cli with a call to firefly launch when testing their contracts with Truffle or any other Web3-based testing platform. We release standalone binaries, Ubuntu packages, and Docker images to make it accessible on any platform. Our derived interpreter performs comparably to Ganache meaning that users are not inconvenienced with longer wait times when testing locally or on CI.
Our web frontend also integrates directly with the GitHub status checks API, so that users can block merging new or changed code into their repositories which does not pass existing or newly written tests. Our GitHub CI integration is easy for users to add to existing repositories, only requiring that they switch out Ganache for Firefly and adding a single status check via our web interface.
Test coverage. The Firefly client is instrumented to collect bytecode-level coverage information when executing tests. The coverage collection is low overhead, and so is always enabled. In addition to collecting information about whether a given opcode is visited, we also collect information about how the opcode was visited. This can be opcode agnostic (eg. for each opcode we record whether it caused an EVM exception), or opcode specific. For example, the JUMPI opcode in EVM is a conditional branch, so we not only record that the JUMPI was visited, but also which branch of its behavior was taken. An even more subtle example is the ADD opcode, which behaves like normal unbounded integer addition when the result does not overflow, but otherwise will silently overflow and produce a bad result (which can easily lead to bugs where money is lost). For each ADD opcode (and all other arithmetic opcodes), we not only record whether it was visited, but if it overflowed as well. This enables us to inform the user that they have not tested that they handled overflow correctly in our UI. There are many such examples of “silent” branching in the EVM that Solidity-level coverage tools will have a hard time reporting, but are straightforward for us to collect data on. Adding the needed instrumentation for new coverage scenarios has been streamlined, often only requiring a single additional line of K code, so we plan to keep getting feedback from our research sessions with advanced Solidity developers about what they would like reported and integrating it into the tool.
Our GitHub status check interface also allows users to specify their target contract coverage so that when new code is merged the user can be confident that not only are all their tests passing, but their coverage target is met. The user is allowed to configure (or disable) their own coverage target, letting them use this feature as they wish.
Blackbox random testing. Contracts on the Ethereum blockchain are invoked using transactions, which contain ABI encoded data stating which function of the target contract should be invoked and what arguments to invoke it with. Testing tools like Truffle translate the high-level Solidity function calls (written either in Solidity or in JS) to an ABI encoded transaction for the Web3 client to execute. Our tool extracts the contract function names and signatures (available as structured JSON via Truffle’s compile step) and generates random transactions to submit to the contract based on the function signatures. The tool first runs the contracts existing test-suite, measures the coverage, then generates random transactions until the target coverage is reached. Any generated transactions which increase the coverage are presented to the user as human-readable Solidity, so that they can be added to the user’s test-suite (along with the assertions about the post-state of the new transactions which the user must supply). Our tests on the backend prototype are very promising; in many cases, we are able to increase the coverage of many contracts in our test-suite (which consists of a selection of real-world public contracts) to at least 75% automatically.
This information is presented to the user in the already developed coverage UI as a tip on the Solidity lines which the new test exercises. The tool runs quickly (eg. increasing coverage from 25% to 75% in under two minutes), so we will automatically run this tool against coverage reports which are below 70% for a preset amount of time (likely two minutes) and present any additional test inputs the user may want to consider via our web interface. Because our GitHub status checks API automatically places a link to the coverage report in their GitHub Pull Request, if the user sees that their coverage does not meet the desired target they can click the inserted link to the coverage report to see additional test inputs they can use to reach their target coverage.
The next two tools also make use of our concrete execution engine, but require extensions to the user’s test-suite to make use of them. These extensions are designed to be compatible with other testing clients, so the user can still choose to run their extended tests against Ganache (though doing so will not give them the benefit of extending the tests).
Whitebox random testing. Blackbox random testing blindly generates transactions for the given contract to find inputs which are not exercised by the existing test-suite. This approach is good for finding new inputs, but not for stating the user’s intentions. Whitebox random testing allows the user to specify exactly how to randomly test their contract directly in their test-suite (using the Solidity syntax for testing that they are familiar with). We will provide the user with a FireflyWhitebox smart contract which they can make calls into to get random inputs for their tests. The default implementation of these calls will just return a normal number, so the user can still run their tests through other clients if desired, but Firefly will intercept these calls and return a random value to run the test with. By extending the tests to take advantage of the FireflyWhitebox contract, the user will be able to run their tests against large batches of random values instead of a fixed set of inputs. This changes their unit test into a property test, which is a popular approach for increasing assurance via testing in many languages.
Runtime monitoring. Many properties that a user wishes to be true of their contract can not easily be stated as simple unit or property tests. For example, a user may want to state the temporal property that “every time function f() of contract A is called, it is followed by a call to function g() of contract B” (perhaps contract B needs to be synchronized with the changes to contract A). A high-level property like this is more easily stated as a temporal formula over the state of the contracts, to be checked over all executions. We have a prototype of the backend extension which allows the user to supply a temporal formula on the command line, and as the users test-suite is executed the temporal formula is checked continuously as well. This does not require the user to modify their test-suite, and works regardless of where the test execution comes from (ie. it will work seamlessly with the black/white-box random testers). The user only must supply the temporal formulae that they are interested in checking as an extra file provided alongside their test-suite.
The remaining tools will extend the concrete execution based tools to the symbolic realm, taking users from the realm of testing into verification. We picked the symbolic execution tools in such a way that it requires no extra setup for a developer already using our concrete execution tools.
Assertion violation checker. Assertions are used in Solidity code to check that certain states are never met during execution (and to abort execution if they are met). They are often used as sanity checks in code-bases to ensure that even in the presence of bugs, the contract cannot reach an undesirable state. Our symbolic execution tool can be placed in search mode to check that these statements are actually never reachable, and to output a counterexample in the cases where they are reachable. The generated counterexample can be added directly to the users test-suite, so that they can fix the bug which caused the assertion to be reachable and make sure it will not happen again. This requires no modification of the users code or test-suite, which makes it an attractive tool for sanity checking the users code for violations.
Bounded model checker. Runtime monitoring checks that during concrete executions of the users test-suite, certain temporal properties over the contract state are always maintained. This is useful in its own right but incomplete, in the sense that for the executions of a user’s test-suite the property may not be violated, but maybe some other execution does violate the property. Our bounded model checker will instead symbolically check these properties across all executions of the contract, ensuring that there does not exist an execution which violates the property. This requires no modification of the users code or test-suite beyond that required by the runtime monitoring (which is to state the properties the user wants to check), making it easy for the user to transition from monitor based testing to full temporal verification.
Whitebox symbolic testing. When a user writes down a property test for the whitebox random tester, they are asserting that on any execution that property should hold. The whitebox random tester only checks that it is true for some specified number of random concrete inputs, but does not exhaust the entire state-space. Whitebox symbolic testing will instead run the same test with our symbolic execution engine, ensuring that there are no inputs which can violate the property stated by the user. This requires no additional modification to the testing harness above the whitebox random testing, once again providing an easy way for users to smoothly transition from unit testing to property testing to formal verification.