diff --git a/README.md b/README.md
index 2d3158f81581eade7c33d0f0736da63708759bd3..736414389792023721e13e7d3dc5569cc45b6d68 100644
--- a/README.md
+++ b/README.md
@@ -1,87 +1,302 @@
-# TCP/IP Stack Design Using Vivado HLS
+# Scalable Network Stack supporting TCP/IP, RoCEv2, UDP/IP at 10-100Gbit/s
 
+#### Table of contents
+1. [Getting Started](#gettingstarted)
+2. [Compiling HLS modules](#compiling)
+3.	[Interfaces](#interfaces)
+	1.	[TCP/IP](#tcp-interface)
+	2. [ROCE](#roce-interface)
+4. [Benchmarks](#benchmarks)
+5.	[Publications](#publications)
+6. [Citation](#citation)
+7. [Contributors](#contributors)
+
+
+<a name="gettingstarted"></a>
 ## Getting Started
 
 ### Prerequisites
-- Xilinx Vivado 2018.1
-- License for Xilinx 10G MAC IP
-- Linux OS
+- Xilinx Vivado 2019.1
+- cmake 3.0 or higher
 
 Supported boards (out of the box)
 - Xilinx VC709
 - Xilinx VCU118
 - Alpha Data ADM-PCIE-7V3
 
-### Installation
+<a name="compiling"></a>
+## Compiling (all) HLS modules and install them to your IP repository
+
+0. Optionally specify the location of your IP repository:
+```
+export $IPREPO_DIR=/home/myname/iprepo
+```
+
+1. Create a build directory
+```
+mkdir build
+cd build
+```
+
+2.a) Configure build
+```    
+cmake .. -DDATA_WIDTH=64 -DCLOCK_PERIOD=3.1 -DFPGA_PART=xcvu9p-flga2104-2L-e -DFPGA_FAMILY=ultraplus -DVIVADO_HLS_ROOT_DIR=/opt/Xilinx/Vivado//2019.1/bin/
+```
+
+2.b)Alternatively you can use one the board name ot configure your build
+```
+cmake .. -DDEVICE_NAME=vcu118
+```
+
+All cmake  options:
+
+| Name                        | Values                | Desription                                                              |
+| --------------------------- | --------------------- | ----------------------------------------------------------------------- |
+| DEVICE_NAME                 | <vc709,vcu118,adm7v3> | Supported devices                                                       |
+| NETWORK_BANDWIDTH           | <10,100>              | Bandwidth of the Ethernet interface in Gbit/s, default depends on board |
+| FPGA_PART                   | <name>                | Name of the FPGA part, e.g. xc7vx690tffg1761-2                          |
+| FPGA_FAMILY                 | <7series,ultraplus>   | Name of the FPGA part family                                            |
+| DATA_WIDTH                  | <8,16,32,64>          | Data width of the network stack in bytes                                |
+| CLOCK_PERIOD                | <nanoseconds>         | Clock period in nanoseconds, e.g. 3.1 for 100G, 6.4 for 10G             |
+| TCP_STACK_MSS               | <value>               | Maximum segment size of the TCP/IP stack                                |
+| TCP_STACK_WINDOW_SCALING_EN | <0,1>                 | Enalbing TCP Window scaling option           |
+| VIVADO_HLS_ROOT_DIR         | <path>                | Path to Vivado HLS directory, e.g. /opt/Xilinx/Vivado/2019.1            |
+
+3. Build HLS IP cores and install them into IP repository
+```
+make installip
+``` 
+
+
+For an example project including the TCP/IP stack or the RoCEv2 stack with DMA to host memory checkout our Distributed Accelerator OS [DavOS](https://github.com/fpgasystems/davos).
+
+
+## Working with individual HLS modules
+
+1. Setup build directory, e.g. for the TCP module
+
+```
+$ cd hls/toe
+$ mkdir build
+$ cd build
+$ cmake .. -DFPGA_PART=xcvu9p-flga2104-2L-e -DDATA_WIDTH=8 -DCLOCK_PERIOD=3.1
+```
+
+2. Run c-simulation
+```
+$ make csim
+```
+
+3. Run c-synthesis
+```
+$ make synthesis
+```
+
+4. Generate HLS IP core
+```
+$ make ip
+```
+
+5. Install HLS IP core into the IP repository
+```
+$ make installip
+```
+
+<a name="interfaces"></a>
+## Interfaces
+All interfaces are using the AXI4-Stream protocol. For AXI4-Streams carrying network/data packets, we use the following definition in HLS:
+```
+template <int D>
+struct net_axis
+{
+	ap_uint<D>    data;
+	ap_uint<D/8>  keep;
+	ap_uint<1>    last;
+};
+```
 
-Make sure that Vivado and Vivado HLS are in your PATH. Use version 2018.1
+<a name="tcp-interface"></a>
+### TCP/IP
 
-Navigate to the _hls_ directory:
+#### Open Connection
+To open a connection the destination IP address and TCP port have to provided through the `s_axis_open_conn_req` interface. The TCP stack provides an answer to this request through the `m_axis_open_conn_rsp` interface which provides the sessionID and a boolean indicating if the connection was openend successfully.
 
-    cd hls
+Interface definition in HLS:
+```
+struct ipTuple
+{
+	ap_uint<32>	ip_address;
+	ap_uint<16>	ip_port;
+};
+struct openStatus
+{
+	ap_uint<16>	sessionID;
+	bool		success;
+};
+
+void toe(...
+	hls::stream<ipTuple>& openConnReq,
+	hls::stream<openStatus>& openConnRsp,
+	...);
+```
 
-Execute the script generate the HLS IP cores for your board:
 
-    ./generate_hls vc709
 
-For the VCU118 run
 
-    ./generate_hls vcu118
+#### Close Connection
+To close a connection the sessionID has to be provided to the `s_axis_close_conn_req` interface. The TCP/IP stack does not provide a notification upon completion of this request, however it is guranteeed that the connection is closed eventually.
 
+Interface definition in HLS:
+```
+hls::stream<ap_uint<16> >& closeConnReq,
 
-Navigate to the _projects_ directory:
+```
 
-    cd ../projects
+#### Open a TCP port to listen on
+To open a port to listen on (e.g. as a server), the port number has to be provided to `s_axis_listen_port_req`. The port number has to be in range of active ports: 0 - 32767. The TCP stack will respond through the `m_axis_listen_port_rsp` interface indicating if the port was set to the listen state succesfully.
 
-Create the example project for your board.
+Interface definition in HLS:
+```
+hls::stream<ap_uint<16> >& listenPortReq,
+hls::stream<bool>& listenPortRsp,
+```
 
-For the Xilinx VC709:
+#### Receiving notifications from the TCP stack
+The application using the TCP stack can receive notifications through the `m_axis_notification` interface. The notifications either indicate that new data is available or that a connection was closed.
 
-    vivado -mode batch -source create_vc709_proj.tcl
+Interface definition in HLS:
+```
+struct appNotification
+{
+	ap_uint<16>			sessionID;
+	ap_uint<16>			length;
+	ap_uint<32>			ipAddress;
+	ap_uint<16>			dstPort;
+	bool				closed;
+};
+
+hls::stream<appNotification>& notification,
+```
 
-For the Alpha DATA ADM-PCIE-7V3:
+#### Receiving data
+If data is available on a TCP/IP session, i.e. a notification was received. Then this data can be requested through the `s_axis_rx_data_req` interface. The data as well as the sessionID are then received through the `m_axis_rx_data_rsp_metadata` and `m_axis_rx_data_rsp` interface.
 
-    vivado -mode batch -source reate_adm7v3_proj.tcl
+Interface definition in HLS:
+```
+struct appReadRequest
+{
+	ap_uint<16> sessionID;
+	ap_uint<16> length;
+};
+
+hls::stream<appReadRequest>& rxDataReq,
+hls::stream<ap_uint<16> >& rxDataRspMeta,
+hls::stream<net_axis<WIDTH> >& rxDataRsp,
+```
 
-For the Xilinx VCU118:
+Waveform of receiving a (data) notification, requesting data, and receiving the data:
 
-    vivado -mode batch -source create_vcu118_proj.tcl
+![signal tcp-rx-handshake](https://svg.wavedrom.com/github/fpgasystems/fpga-network-stack/master/waveforms/tcp-rx-handshake.json5)
 
 
-After the previous command executed, a Vivado project will be created and the Vivado GUI is started.
+#### Transmitting data
+When an application wants to transmit data on a TCP connection, it first has to check if enough buffer space is available. This check/request is done through the `s_axis_tx_data_req_metadata` interface. If the response through the `m_axis_tx_data_rsp` interface from the TCP stack is positive. The application can send the data through the `s_axis_tx_data_req` interface. If the response from the TCP stack is negative the application can retry by sending another request on the `s_axis_tx_data_req_metadata` interface.
 
-Click "Generate Bitstream" to generate a bitstream for the FPGA.
+Interface definition in HLS:
+```
+struct appTxMeta
+{
+	ap_uint<16> sessionID;
+	ap_uint<16> length;
+};
+struct appTxRsp
+{
+	ap_uint<16> sessionID;
+	ap_uint<16> length;
+	ap_uint<30> remaining_space;
+	ap_uint<2>  error;
+};
+
+hls::stream<appTxMeta>& txDataReqMeta,
+hls::stream<appTxRsp>& txDataRsp,
+hls::stream<net_axis<WIDTH> >& txDataReq,
+```
 
-## Testing the example project
+Waveform of requesting a data transmit and transmitting the data. 
+![signal tcp-tx-handshake](https://svg.wavedrom.com/github/fpgasystems/fpga-network-stack/master/waveforms/tcp-tx-handshake.json5)
 
-The default configuration deploys a TCP echo server and a UDP iperf client. The default IP address the board is 10.1.212.209. Make sure the testing machine conencted to the FPGA board is in the same subnet 10.1.212.*
 
-As an intial connectivity test ping the FPGA board by running 
+<a name="roce-interface"></a>
+### RoCE (RDMA over Converged Ethernet)
 
-    ping 10.1.212.209
+#### Load Queue Pair (QP)
+Before any RDMA operations can be executed the Queue Pairs have to established out-of-band (e.g. over TCP/IP) by the hosts. The host can the load the QP into the RoCE stack through the `s_axis_qp_interface` and `s_axis_qp_conn_interface` interface.
 
-After reprogramming the FPGA the first ping message is lost due to a missing ARP entry in the ARP table. However, the FPGA should reply to all following ping messages.
+Interface definition in HLS:
+```
+typedef enum {RESET, INIT, READY_RECV, READY_SEND, SQ_ERROR, ERROR} qpState;
+
+struct qpContext
+{
+	qpState		newState;
+	ap_uint<24> qp_num;
+	ap_uint<24> remote_psn;
+	ap_uint<24> local_psn;
+	ap_uint<16> r_key;
+	ap_uint<48> virtual_address;
+};
+struct ifConnReq
+{
+	ap_uint<16> qpn;
+	ap_uint<24> remote_qpn;
+	ap_uint<128> remote_ip_address;
+	ap_uint<16> remote_udp_port;
+};
+
+hls::stream<qpContext>&	s_axis_qp_interface,
+hls::stream<ifConnReq>&	s_axis_qp_conn_interface,
+```
 
-  
-For the TCP echo server you can use netcat:
+#### Issue RDMA commands
+RDMA commands can be issued to RoCE stack through the `s_axis_tx_meta` interface. In case the commands transmits data. This data can be either originate from the host memory as specified by the `local_vaddr` or can originate from the application on the FPGA. In the latter case the `local_vaddr` is set to 0 and the data is provided through the `s_axis_tx_data` interface.
 
-    echo 'hello world' | netcat -q 1 11.1.212.209 7
+Interface definition in HLS:
+```
+typedef enum {APP_READ, APP_WRITE, APP_PART, APP_POINTER, APP_READ_CONSISTENT} appOpCode;
+
+struct txMeta
+{
+	appOpCode 	op_code;
+	ap_uint<24> qpn;
+	ap_uint<48> local_vaddr;
+	ap_uint<48> remote_vaddr;
+	ap_uint<32> length;
+};
+hls::stream<txMeta>& s_axis_tx_meta,
+hls::stream<net_axis<WIDTH> >& s_axis_tx_data,
+```
+Waveform of issuing a RDMA read request:
+![signal roce-read-handshake](https://svg.wavedrom.com/github/fpgasystems/fpga-network-stack/master/waveforms/roce-read-handshake.json5)
 
-Alternatively, you can use the _echoping_ linux commandline tool.
+Waveform of issuing an RDMA write request where data on the FPGA is transmitted:
+![signal roce-write-handshake](https://svg.wavedrom.com/github/fpgasystems/fpga-network-stack/master/waveforms/roce-write-handshake.json5)
 
-For the TCP and UDP iperf test, see [here](http://github.com/dsidler/fpga-network-stack/wiki/iPerf-Benchmark). 
 
 
-## Configuration
+<a name="interfaces"></a>
+## Benchmarks
+(Coming soon)
 
-Coming soon
 
+<a name="publications"></a>
 ## Publications
 - D. Sidler, G. Alonso, M. Blott, K. Karras et al., *Scalable 10Gbps
 TCP/IP Stack Architecture for Reconfigurable Hardware,* in FCCM’15, [Paper](http://davidsidler.ch/files/fccm2015-tcpip.pdf), [Slides](http://fccm.org/2015/pdfs/M2_P1.pdf)
 
 - D. Sidler, Z. Istvan, G. Alonso, *Low-Latency TCP/IP Stack for Data Center Applications,* in FPL'16, [Paper](http://davidsidler.ch/files/fpl16-lowlatencytcpip.pdf)
 
+
+<a name="citation"></a>
 ## Citation
 If you use the TCP/IP stack in your project please cite one of the following papers and/or link to the github project:
 ```
@@ -95,6 +310,19 @@ If you use the TCP/IP stack in your project please cite one of the following pap
 	booktitle={FPL'16}, 
 	title={{Low-Latency TCP/IP Stack for Data Center Applications}}, 
 }
+@PHDTHESIS{sidler2019innetworkdataprocessing,
+	author = {Sidler, David},
+	publisher = {ETH Zurich},
+	year = {2019-09},
+	copyright = {In Copyright - Non-Commercial Use Permitted},
+	title = {In-Network Data Processing using FPGAs},
+}
 ```
 
-For more information please visit the [wiki](http://github.com/dsidler/fpga-network-stack/wiki)
+<a name="contributors"></a>
+## Contributors
+- [David Sidler](http://github.com/dsidler), [Systems Group](http://systems.ethz.ch), ETH Zurich
+- [Monica Chiosa](http://github.com/chipet), [Systems Group](http://systems.ethz.ch), ETH Zurich
+- [Mario Ruiz](https://github.com/mariodruiz), HPCN Group of UAM, Spain
+- [Kimon Karras](http://github.com/kimonk), former Researcher at Xilinx Research, Dublin
+- [Lisa Liu](http://github.com/lisaliu1), Xilinx Research, Dublin