ajaxwin commited on
Commit
f93644a
·
1 Parent(s): e5b8b13

feat: Add initial documentation for smart contract problem and solution, including tasks for vulnerability detection and property discovery

Browse files
Files changed (2) hide show
  1. SmartContract.md +22 -0
  2. Tasks.md +81 -0
SmartContract.md ADDED
@@ -0,0 +1,22 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ## Problem:
2
+
3
+ Smart contract code can have bugs. To reduce any possibility of bugs they are formally verified. Problem with formal specification is that the specification itself needs to know what the code intends to do. For ex: Bank contract won't want balance to be in negative. But the formal specification might not know that it may check for integer overflow but not whether it always remains positive. That part is identified by humans and then they update the formal specification.
4
+
5
+ The Certora Prover takes as input a contract and a specification and formally proves that the contract satisfies the specification in all scenarios. Importantly, the guarantees of theCertora Prover are scoped to the provided specification, and any cases not covered by the specification are not checked by the Certora Prover.
6
+
7
+
8
+ ## Solution:
9
+
10
+ The properties are generated by humans based on domain knowledge. That's why they create a database of rules to identify similar contracts
11
+
12
+ and rules. This similar rules give LLM some domain knowledge as to what is expected from the contract.
13
+
14
+ They also provide llm with similar example from their vast vector database. Prover was 38K LOC. They made a custom compiler for PSL and they integrated it with solidity compiler. They are using a prover to check if the smart contract follows the generated property specification.
15
+
16
+ For matching they are identifying a single function (system under test) in a contract and match similar rules/functions in their vector database.
17
+
18
+ Domain Knowledge:
19
+
20
+ 1. Properties are: what. What this function should satisfy?
21
+ 2. Rule is how, so in specification language.
22
+ 3. Invariant is what must be always true for your contract to be safe.
Tasks.md ADDED
@@ -0,0 +1,81 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Tasks
2
+
3
+ TASK 1 Targeted Vulnerability Detection: (medium)
4
+
5
+ Initial Setup: Single Solidity File
6
+
7
+ Initial Observation: What the contract is about and what it is supposed to do
8
+
9
+ Objective: Vulnerable function and it's issue in 2-5 words or NO if no vulnerability exists
10
+
11
+ Data file contains 7-8 vulnerabilities so make sure to choose a random function every time you reset environment
12
+
13
+ Actions:
14
+
15
+ list_functions, -0.05
16
+ get code of a function, -0.10 if wrong else +0.05
17
+ get function summary of a function (comments), -0.05 if wrong else +0.03
18
+ Correct submission, +5
19
+ Wrong submission, -1.50
20
+ Repeated query, -0.4
21
+ Get file metadata, (header comment), -0.04
22
+ get state variable, -0.05
23
+ get call graph, -0.08
24
+ If unknown action negative score
25
+ TASK 2 Property Discovery: (hard)
26
+
27
+ Initial Setup: One Function from a solidity file from a contract with known properties
28
+
29
+ Initial Observation: What the contract is about and what it is supposed to do
30
+
31
+ Objective: Property generated will be in Natural language related to the function
32
+
33
+ Actions:
34
+
35
+ get a similar rule/property from different contracts sol file, -0.2
36
+ submit a property (reward based on how close it is to original property) (0 - 5) one submission attempt per episode, determinstic checker
37
+ get file natspec comments, -0.03
38
+ get function natspec comments, -0.08
39
+ get code of a function, -0.06
40
+ get related functions, -0.06
41
+ get input and output of the function, -0.04 (Changing it to get signature)
42
+ TASK 3 Rule Checker (easy)
43
+
44
+ Initial Setup: One sol file with at least one function that's breaking the property
45
+
46
+ Initial Observation: Given a property in english (THE MAIN DIFFERENCE)
47
+
48
+ Objective: Identify rule breaking function
49
+
50
+ Actions:
51
+
52
+ Get formalized version of the property, -0.03
53
+ Get list of functions, -0.05
54
+ get function metadata, -0.05
55
+ get code of a function, -0.1
56
+ get state variables, -0.05
57
+ get call graph, -0.08
58
+ submit a function (subfunction of target = 1.5, wrong = -1.5, correct = 5) only one submission per episode
59
+ Data From Projects by Certora:
60
+
61
+ Generated JSON File contains about 7 to 8 vulnerabilites from these 3 projects audit report by certora.
62
+
63
+ AaveVault
64
+ AaveVaultV2
65
+ Lido Finance Report
66
+ Problems:
67
+
68
+ PERHAPS I NEED TO TEST TO FIND THE RIGHT SET OF REWARDS FOR THE ACTIONS, ALSO MAX STEPS
69
+ Does it makes sense to fetch one global invariant at a time, NO ACC TO LLM TWICE
70
+ In vulnerability detection, action file header comment doesn't make much sense acc to LLM
71
+ maybe call graph actions can be revised, like it can be more detailed, instead of getting entire graph at once, get it in parts.
72
+ Vulnerability type doesn't have a particular name, so a function and a descripiton would have to do
73
+ Use Sentence transformer library along with keyword matching to guess the similarity, can't use this
74
+ MAYBE INCREASE SIZE OF THE DATASET
75
+ getfunctioncode can broken into more simpler actions such as get parameters, what it returns
76
+ get state variables action etc
77
+ get property specification, doesn't seem very useful though
78
+ can getting severity be a good action ?? does it help in identifying vulnerabliity, maybe we can give strict reference as to what is highly severe
79
+ CURRENT:
80
+
81
+ Output format by agent is mentioned in system prompt.