(09/11/17 - 05:02PM)
In this project you will be implementing a minimally viable blockchain, called MVB, and – consequently – will instantiate a new coin called MVBCoin.
Disclaimer #1: We are partially misleading you. While you will build many of the core components of a blockchain, there are certain components of actual cryptocurrencies (such as Bitcoin) which we will not have you worry about. Without these components, MVBCoin is actually unsafe and susceptible to attacks (such as double spends). Therefore, hold back on trading your MVBCoins!
Disclaimer #2: We assume that you have already familiarized yourself with the Bitcoin paper [1, 2], and have some rudimentary understanding of the protocol. This document will avoid any algorithmic descriptions, and rather simply focus on the technical details. Please read [1, 2] for protocol specifics.
You will be submitting a .zip file of a directory which you will call mvbcoins. The internals of the directory will be laid out in the following manner:
mvbcoins/
Makefile
README.txt
src/
make
(See make
. For example, your MVBCoins node is required to work on Fractus and may require some instructions for installing a software package. However, most likely (and ideally), README.txt should be empty and without any special instructions. But if you must absolutely request some special instructions, this is the place to put them. We will, however, not accept special instructions on modification of system-wide environment variables, such as – for example – special setups to GOPATH and GOROOT if you are using GO.After calling make
, mvbcoins will look as follows
mvbcoins/
Makefile
README.txt
node
src/
If node is compiled in some strange directory, you will be penalized.
The output executable node will be run with the following command line arguments:
node can be written in any of the following languages: C, C++, Javascript, Python, Golang, Java, or Rust. If you want to use a language not listed, ping the TA (Kevin). If you are submitting your program in an interpreted language (such as Python), please ensure that it is wrapped up in a shell command, which then calls the interpreter. All tests will be run as:
./node arguments
Important: Your submission must work on Fractus. In general, a 32/64-bit Linux machine (not any arbitrary Unix, and definitely not Windows), but, in particular, Fractus from MP0.a (Eucalyptus) or MP0.b (OpenStack).
If your project cannot be compiled from the command line using the make file and needs some special IDE or tools, your submission will not be accepted.
You will download on CMS an executable called tester. This executable takes in several parameters. Run ./tester -h
to see the list of all accepted arguments and how this tool will help you lay out your wire-protocol requirements.
As an example:
./tester --ports 10000 --testtransaction --withdoublespends --concurrentconns 10 --numtransactions 1000 --verbose
Will launch 10 concurrent nodes that will bombard your implementation on port 10000 using 1000 sequential transactions. Careful about these numbers here. If you enable verbose
for a lot of concurrent connections and transactions, you cause your terminal to flush quite a bit of text. If you accidentally write --concurrentconns 1000000
against your single process (which may not be suited yet to handling many connections), your machine may crash. tester is aggressive, so careful with the commands.
You will also download an executable called server. This executable works in the inverse direction of tester, i.e. it simply absorps anything you send to it and doesn’t reply back. It simply prints it out. Careful about how you connect to server, because it may print things out in a jumbled order if you send multiple concurrent messages to it.
Both of these tools will be helpful tools in sanity checks.
Conventions: All messages will be raw bytes. By convention, if we say:
M = [VALUE1(6B)||VALUE2(32B)]
This means that the message is a 38 byte message, where the first 6 bytes are VALUE1 and the next 32 bytes are VALUE2.
Now, let’s say VALUE1 is, for example, a port number. If the port is 9000, then the most efficient way of encoding the value 9000 is in a 14 bit number, which you round up to 16 bits, for a total of 2 bytes. YOU WILL NOT BE EFFICIENT. You will, on the other hand, encode each digit as a single character, such that each digit is 1 byte. Furthermore, you must prepend the number with any leading zeros. In other words, you may not encode the port with only 4 bytes. You must encode with 6 bytes. In that case, VALUE1 must be the encoded byte string b’009000’, which as a raw byte array is [48 48 57 48 48 48].
Use tester and server to sanity check your message parsing.
Launch procedure: The first node to be launched must be the bootstrap node. This node defaults to port 8888, and is the main entry point for discovery for all other nodes. You can implement the internals of the boostrap node however you would like. However, once launched the bootstrap node remains online and listens for messages of the format:
PORT(6B)||NODE_ADDRESS(32B)
Where PORT is the port number of a worker thread (to be defined), and NODE_ADDRESS is the node address (a node has multiple worker threads). Make NODE_ADDRESS to be:
NODE_ADDRESS = SHA256("your_netid")
The bootnode will respond with a message of the form
OP_CODE_BOOTNODE(1B)||NUM_PORTS(6B)||PORT_1(6B)||PORT_2(6B)...||PORT_N(6B)
where NUM_PORTS = N, and OP_CODE_BOOTNODE defaults to '4'. Make sure that the bootnode stays alive and repeatedly broadcasts its known set of ports.
There will be five types of messages that nodes can exchange with one another:
Where numTransactionsInBlock defaults to 50000.
Note: It is extremely important to make sure your messages conform to the specified format above. If even a single byte is off, it may be possible for your submission to fail. Use tester and server to sanity check your message formats.
It is up to you to manage the internal datastructures of your node. However, here are a few helpful tips on what you should be maintaining:
The Blockchain: The blockchain itself. Caution: You should be careful about wrapping up your blockchain(s) in a datastructure that can maintain multiple forks of the blockchain if need be.
The Mempool: The mempool is the set of all transactions that have not yet been placed in a block that has been mined. In distributed-computing terms, this is the list of all transactions that have been received, but not yet committed.
The UTXO Set: The UTXO set is the set of all addresses that can spend money. Initially, the UTXO set is initialized as follows:
for i = 0 to 100:
add {sha256(convert_to_byte_array(string(i))), 100000} to UTXO set
I.e. there are 100 addresses that can spend 100000 MVBCoins each. Remember to do this initialization when playing around with tester, because it assumes you have initialized this set of accounts.
WARNING: Here be dragons, or however the saying goes! It is important to realize that some transactions may need to be rolled-back. For example, suppose you mine a block with height 500, but another block with the same height happened to be mined at the same time (the other block is also valid). Your block got pushed back by the rest of the miners. You now must somehow be careful about rolling back all the transactions that you may have “pre-committed”. I.e. you must roll-back all account changes to your UTXO set.
For the first milestone, you will be writing a single-process, single-threaded version of MVB. Notice that this means that the command line arguments to node will slight change in functionality. --numworkers will default to 1 and will ignore any number you supply to it, and --ports will only parse the first port and ignore the rest. Your node must do the following:
Note that we are purposefully ignoring the internals of the algorithms chosen to perform your blockchain management. We impose no requirements on how you manage your UTXO set, and how you ensure that you roll-back transactions that you have pre-committed after receiving a new block. We only require that if your node detects multiple blockchains, then the winning blockchain is the one with at least 6 blocks over all others. As a node, you always mine only for the longest blockchain.
Zip your entire mvbcoins software distribution directory and submit the output mvbcoins.zip. Use the following command, exactly:
zip -r mvbcoins.zip mvbcoins
For the second milestone, you will optimize your node, by making usage of concurrency.
Unlike the first milestone, the second milestone will be multithreaded, where --numworkers flag will now indicate the number of working ports that your node supports. You can think of this architectural design as a collection of machines, each listening in on a unique port. As a reminder, --numworkers must be the same as the ports passed using --ports flag.
We will also add an additional flag, --numcores, which indicates to your node how many cores there are available. Make usage of this flag in your architectural design (e.g. --numcores can be the number of threads in a thread pool). So, while --numworkers indicates the number of “machines”, --numcores indicates the number of threads you should be making utilization of. For example, if --numworkers is 2 and --numcores is 4, then you will be listening in on two ports, and the aggregate of all ports should use 4 threads.
You will be handing in two things:
You can assume that other nodes (miners) are not malicious and will only mine valid blocks. However, you may receive transactions from clients that are double spends (or plain invalid). In other words, when you receive blocks (either from broadcasts or from explicit synchronization requests), you can assume the block you receive back has been correctly mined and has no invalid transactions (from the view of the miner that sent you the block). You must be careful about rolling back transactions appropriately if you detect a fork (as we warned before).
TCP or UDP?
TCP.What should the boostrap node do when it gets a new node connection?
Send all known ports to all known ports. If a node has gone down, remove it from the known-port-list.I'm confused about the number of worker threads. Is there only one miner?
There are multiple miners (multiple instances of the process node), but each of the miners has only a single thread.What should I do as soon as I get a new transaction?
Check if the transaction is valid, then to all other miners using the SEND_TRANSACTION message.[1] Bitcoin: A peer-to-peer electronic cash system. Nakamoto, Satoshi. Consulted 1.2012 (2008): 28.
[2] Bitcoin and Cryptocurrency Technologies: A Comprehensive Introduction. A. Narayanan, J. Bonneau, E. Felten, A. Miller, S. Goldfeder. Princeton University Press; 2016.
Read Chapters 1 and 2.