SLIDE 1 Geoapplications development http://rgeo.wikience.org
Higher School of Economics, Moscow, www.cs.hse.ru
SLIDE 3 Agenda (cont.)
SLIDE 4 Netty
Netty is a NIO client-server framework, which enables quick and easy development of network applications, such as protocol servers and clients.
SLIDE 5 Interactive mapping service
Problem:
1. A web application displays an interactive map in a browser
2. It shows additional layers on top of the map, say air temperature
3. The user pans and zooms the map in and out (applied synchronously to all layers)
4. The client (software) cannot store all data at its end (e.g., due to data volume)
5. Thus, the client must be able to dynamically obtain portions of data to display for a given geographical area and scale (after each pan/zoom)
The typical model is as follows (more details are added slide by slide):
Figure: client and server connected over the Internet.
SLIDE 6 Network protocol & messages
1. We need to exchange data over the network, and we want a primitive for this
2. Call it a message: data that can be sent/received over the network
3. Think of it as mail or a parcel: you can fill it with data or a command
4. “message” = client query or server response
5. Network protocol: a fixed set of possible message types, their format (internal structure) and, possibly, rules (in what order certain types of messages are allowed to be sent)
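The "rules" part of a protocol can be sketched as a small state machine. The example below is only an illustration: the message types CONNECT, QUERY and DISCONNECT and the class name ProtocolSession are hypothetical and are not taken from any protocol discussed in these slides.

```java
/** Hypothetical protocol rules: CONNECT first, then any number of QUERY, then DISCONNECT. */
final class ProtocolSession {
    enum MessageType { CONNECT, QUERY, DISCONNECT }

    private enum State { NEW, CONNECTED, CLOSED }

    private State state = State.NEW;

    /** Returns true (and advances the session) if this message type is allowed right now. */
    boolean accept(MessageType type) {
        switch (type) {
            case CONNECT:
                if (state != State.NEW) return false;
                state = State.CONNECTED;
                return true;
            case QUERY:
                return state == State.CONNECTED;
            case DISCONNECT:
                if (state != State.CONNECTED) return false;
                state = State.CLOSED;
                return true;
            default:
                return false;
        }
    }
}
```

A server could keep one such object per connection and reject any message for which accept returns false.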
SLIDE 7 Protocol definition
A data transfer protocol is a set of conventions that define data exchange between different programs. A network protocol is a set of rules and actions (and their order) that makes it possible to establish a connection and exchange data between two or more devices on a network.
See also: https://en.wikipedia.org/wiki/Communications_protocol
SLIDE 8 WCS (Web Coverage Service)
1. Retrieve raster data over the network from a server that supports WCS
2. Built on top of HTTP
3. Allows subsetting a given dataset (by time, extent, etc.)
4. The raster is returned as a Coverage: a raster with its metadata (extent, projection, etc.); ISO standard: https://en.wikipedia.org/wiki/Coverage_data
5. Key command: GetCoverage
6. The set of supported output formats depends on the server implementation, usually GeoTIFF, NITF, HDF, JPEG, JPEG2000, PNG.
SLIDE 9 OSI model
What is “on top of HTTP”?
SLIDE 10 OSI model: protocols
SLIDE 11 WCS – GetCoverage
GetCoverage is an HTTP GET request (spaces are not allowed, they are for clarity only!)
<URL> ? <key> = <value> {& <key> = <value>}*    (BNF style)

/thredds/wcs/galeon/striped.nc    Server URL
?request=GetCoverage              Command name
&version=1.0.0                    WCS protocol version
&service=WCS                      Server-specific (use WCS)
&format=GeoTIFF                   Output format
&coverage=ta                      Dataset name
&time=2005-05-10T00:00:00Z        Time (subset)
&vertical=100.0                   Vertical level (subset)
&bbox=-134,11,-47,57              Bounding box (extent, subset)
https://www.unidata.ucar.edu/software/thredds/current/tds/reference/WCS.html
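The key/value structure of a GetCoverage request can be assembled programmatically. The sketch below is a minimal illustration using only the JDK; the class and method names (WcsRequest, build, getCoverageExample) are hypothetical, and the parameter values are the ones from the example above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

final class WcsRequest {
    /** Joins parameters as <URL>?<key>=<value>{&<key>=<value>}*, keeping insertion order. */
    static String build(String serverUrl, Map<String, String> params) {
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            // Values are inserted verbatim, as on the slide; a real client
            // should percent-encode them before sending the request.
            query.add(e.getKey() + "=" + e.getValue());
        }
        return serverUrl + "?" + query;
    }

    /** Rebuilds the GetCoverage request shown on the slide. */
    static String getCoverageExample() {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("request", "GetCoverage");
        p.put("version", "1.0.0");
        p.put("service", "WCS");
        p.put("format", "GeoTIFF");
        p.put("coverage", "ta");
        p.put("time", "2005-05-10T00:00:00Z");
        p.put("vertical", "100.0");
        p.put("bbox", "-134,11,-47,57");
        return build("/thredds/wcs/galeon/striped.nc", p);
    }
}
```

A LinkedHashMap is used so that the parameters appear in the same order as in the example, which makes the result easy to compare by eye.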
SLIDE 12 WCS – more details
Pros: very simple; output formats carry a lot of metadata
Cons: formats are too complex for web applications (usually only PNG is used)
Desktop GIS (e.g. Quantum GIS) readily accept NetCDF and other formats
The situation is changing now: geotiffjs, xdrjs, …
Current versions: 1.0.0, 1.1.0, 1.1.1, 1.1.2 and 2.0.1.
Types of queries (besides GetCoverage, see previous slides):
- GetCapabilities – get XML with available coverages, supported operations (usually GetCoverage and DescribeCoverage), some server metadata
  <server_url>?request=GetCapabilities&version=1.0.0&service=WCS
- DescribeCoverage – get XML with metadata for a given coverage
  <server_url>?request=DescribeCoverage&version=1.0.0&service=WCS&coverage=ta
SLIDE 13 WCS – TDS implementation
Again: WCS is a simple protocol with some very complex output formats
- Clone or browse https://github.com/Unidata/thredds
- See readings section at the end
SLIDE 14 WFS – Web Feature Service
≈ WCS, but for vector data, with some minor differences:
- supports both reading and writing data (bidirectional)
- provides basic processing capabilities (filtering, reprojection, etc.)
- Output formats: GML, GeoJSON, CSV, etc.
- etc.: see links in readings
WPS – Web Processing Service
Typical workflow: get capabilities, submit task, poll for completion, retrieve results
The name speaks for itself; we do not go into details here, see the readings section
SLIDE 15 WMS – Web Mapping Service
One of the first geospatial protocols
- Basic notions are renderer, layer, style
- Delivers images rendered according to a given style (e.g. PNG, but PDF, SVG, SWF may also be supported as output formats)
- Vector data may be one of the layers and is rendered as images as well
- Allows basic processing like reprojection “on the fly”
Pros: OK both for web and desktop apps
Cons: very slow
SLIDE 16 Playing with WCS, WMS, OpenDAP metadata
Climate Forecast System Reanalysis data are also published with TDS (this server will be covered later) which supports those protocols: http://nomads.ncdc.noaa.gov/data.php?name=access#CFSR-data
SLIDE 17 Event-driven architecture
1. The moments at which the user pans or zooms are unknown
2. Thus, the moments at which the server receives requests are also unknown
3. Let’s generalize user activity: call pan, zoom, etc. GUI events
4. Let’s generalize network I/O: client receives a response = I/O event, server receives a request = I/O event
5. We will use the on<EventName>(<params>) notation for GUI events
6. We will use onMessageReceived(Message msg) for network events
Figure: the client handles onPan(double distance), onZoom(int level) and onMessageReceived(Message msg); the server, across the Internet, handles onMessageReceived(Message msg).
SLIDE 18 Event-driven architecture (2)
Q: what other events do we need?
- onConnect, onDisconnect, onIdle, etc.
- I/O pattern:
  connect, query × N times, disconnect
- I/O pattern for protocols that open a connection per request (e.g. HTTP):
  connect, 1 query, disconnect (overhead for connecting each time, but there are keepAlive and similar optimizations)
- Abrupt disconnects, connection timeouts, disconnect on idle
Q: what reacts to events?
- Event handlers
- They are methods named after event names: onConnect, etc.
- Called by a framework (yours or 3rd party) when an event occurs
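How a framework calls handlers can be sketched in a few lines. This is a toy illustration, not the way Netty or Mina is implemented: the names TinyEventLoop, Handler, postConnect, postMessage, postDisconnect and runPending are all hypothetical, and messages are plain strings.

```java
import java.util.ArrayDeque;
import java.util.Queue;

final class TinyEventLoop {
    /** Event handlers, named after the events they react to. */
    interface Handler {
        void onConnect(String client);
        void onMessageReceived(String client, String msg);
        void onDisconnect(String client);
    }

    private final Queue<Runnable> pending = new ArrayDeque<>();
    private final Handler handler;

    TinyEventLoop(Handler handler) {
        this.handler = handler;
    }

    // The "framework" side: I/O code posts events as they occur.
    void postConnect(String client) {
        pending.add(() -> handler.onConnect(client));
    }

    void postMessage(String client, String msg) {
        pending.add(() -> handler.onMessageReceived(client, msg));
    }

    void postDisconnect(String client) {
        pending.add(() -> handler.onDisconnect(client));
    }

    /** Drains the queue, calling the matching handler for each event in arrival order. */
    void runPending() {
        Runnable event;
        while ((event = pending.poll()) != null) {
            event.run();
        }
    }
}
```

The application only implements Handler; it never decides when a handler runs, which is the essence of the event-driven style.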
SLIDE 19 Client and server (asynchronous, keep conn.)
Server:
startup { listen(IP:PORT) }
// event handlers
onMessageReceived(Message msg) {
    if (msg instanceof GetCoverage) {
        // not the best way to check the message type
    }
}
onConnect(clientIP, clientPort) {
    LOG.info(…);
}

Client:
connect(IP:PORT)   // connects to the server if not already connected
sendQuery(msg)     // asynchronous: exits immediately
// I/O event handler
onMessageReceived(Message msg) {
    if (msg instanceof ResponseMsg) { }
}
SLIDE 20 Multi-user service
- In the real world, there are 100s or 1000s of concurrent clients
- They generate 100s of queries per second
- The server is a multithreaded application; each query is assigned to a thread upon its arrival (typically the number of threads = k × CPU cores)
- A load balancer and query queues may be used; Q: why?
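The "k × CPU cores" rule and a query queue map directly onto the JDK thread pool. The sketch below is one possible setup, not a recommendation; the class name QueryPool and the parameters k and queueCapacity are hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class QueryPool {
    /**
     * Creates a pool with k * (CPU cores) threads. Queries that arrive while
     * all threads are busy wait in a bounded queue; when the queue is also
     * full, the default policy rejects the task with RejectedExecutionException.
     */
    static ThreadPoolExecutor create(int k, int queueCapacity) {
        int threads = k * Runtime.getRuntime().availableProcessors();
        return new ThreadPoolExecutor(
                threads, threads,              // fixed number of worker threads
                0L, TimeUnit.MILLISECONDS,     // no idle timeout for core threads
                new ArrayBlockingQueue<>(queueCapacity));
    }
}
```

A query would then be submitted with something like pool.submit(() -> process(query)), where process is whatever the server does with a query. The bounded queue absorbs short bursts without creating more threads than the CPU can serve.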
Figure: timeline showing each thread and the periods when it is occupied with query processing.
SLIDE 21 Sockets
- How to maintain connections from 1000s of users?
- Blocking (wait for an operation to complete) and non-blocking sockets: https://www.scottklement.com/rpg/socktut/nonblocking.html
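The difference is easy to observe with java.nio. In the minimal sketch below (the class and method names are hypothetical), a server channel in non-blocking mode returns from accept() at once with null when no client is waiting, whereas a blocking ServerSocket.accept() would wait.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

final class NonBlockingAccept {
    /** Returns true if accept() came back immediately without a connection. */
    static boolean acceptReturnsImmediately() {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0)); // port 0: any free port
            server.configureBlocking(false);
            SocketChannel client = server.accept(); // null: nobody is connecting yet
            return client == null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With non-blocking channels one thread can poll or select over many connections instead of dedicating a blocked thread to each of them.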
SLIDE 22 Server
import java.io.*;
import java.net.*;
import java.text.DateFormat;
import java.util.Date;
import java.util.concurrent.*;

public class Main extends Thread {
    static int server_port = 1000; // ports below 1024 may need elevated privileges
    int mode = 0;
    static BufferedReader bir = null;
    static BufferedWriter biw = null;

    public static void main(String[] args) {
        try {
            ServerSocket ss = new ServerSocket(server_port);
            Socket s = ss.accept(); // blocks until a client connects
            bir = new BufferedReader(new InputStreamReader(s.getInputStream()));
            biw = new BufferedWriter(new OutputStreamWriter(s.getOutputStream()));
SLIDE 23 Server
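            ExecutorService execs = Executors.newFixedThreadPool(2);
            execs.submit(new Main(0));
            execs.submit(new Main(1));
            // Wait for the workers before closing the streams they use;
            // they finish when the client disconnects.
            execs.shutdown();
            execs.awaitTermination(1, TimeUnit.HOURS);
            biw.close(); // closing a socket stream also closes the socket
            bir.close();
            ss.close();
        } catch (IOException | InterruptedException ex) {
            ex.printStackTrace();
        }
    }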
SLIDE 24 Server
    public Main(int mode) {
        this.mode = mode;
    }

    public void run() {
        try {
            String line = null;
            while ((line = bir.readLine()) != null) {
                String ready = DateFormat.getDateTimeInstance().format(new Date());
                System.out.println(line);
                System.out.println(ready);
            }
        } catch (IOException ex) {
            ex.printStackTrace();
        }
    }
}
SLIDE 25 Why do we need buffering (and small messages)?
by Jeffrey Dean (Google)
MTU
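The effect of buffering can be measured without a network. The sketch below counts how many write calls reach the underlying stream, which stands in for a socket here; the names BufferingDemo, CountingStream and underlyingCalls are hypothetical.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

final class BufferingDemo {
    /** Counts write calls that reach the underlying stream (a stand-in for a socket). */
    static final class CountingStream extends OutputStream {
        int calls;

        @Override
        public void write(int b) {
            calls++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            calls++;
        }
    }

    /** Writes n single bytes and returns how many calls reached the underlying stream. */
    static int underlyingCalls(int n, boolean buffered) {
        CountingStream sink = new CountingStream();
        OutputStream out = buffered ? new BufferedOutputStream(sink) : sink;
        try {
            for (int i = 0; i < n; i++) {
                out.write('x');
            }
            out.flush(); // a buffered stream hands over its content here
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.calls;
    }
}
```

For n smaller than the buffer (8192 bytes by default in BufferedOutputStream), the buffered variant reaches the underlying stream once, while the unbuffered one does so n times; on a real socket each of those calls costs a system call and possibly a packet.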
SLIDE 26 Client
import java.io.*;
import java.net.*;
import java.util.concurrent.*;

public class Main extends Thread {
    static final int server_port = 1000;
    int mode = 0; // send/receive
    static BufferedWriter bf = null;
    static BufferedReader br = null;

    public static void main(String[] args) {
        try {
            Socket s = new Socket("127.0.0.1", server_port);
            bf = new BufferedWriter(new OutputStreamWriter(s.getOutputStream()));
            br = new BufferedReader(new InputStreamReader(s.getInputStream()));
SLIDE 27 Client
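            ExecutorService execs = Executors.newFixedThreadPool(2);
            execs.submit(new Main(0));
            execs.submit(new Main(1));
            // Give the workers time to finish before closing the streams they use.
            execs.shutdown();
            execs.awaitTermination(5, TimeUnit.SECONDS);
            // Close the writer first: it flushes pending output and closes the
            // socket, which also releases a reader still blocked in readLine().
            bf.close();
            br.close();
            s.close();
        } catch (UnknownHostException ex) {
            ex.printStackTrace();
        } catch (IOException | InterruptedException ex) {
            ex.printStackTrace();
        }
    }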
SLIDE 28 Client
    public Main(int mode) {
        this.mode = mode;
    }

    public void run() {
        try {
            switch (mode) {
                case 0: // send
                    for (int i = 0; i < 10; i++) {
                        bf.write("I am 007" + i);
                        bf.newLine(); // the server reads with readLine(), so end each message with a line break
                    }
                    bf.flush(); // push the buffered text to the socket
                    break;
                case 1: // read
                    for (int i = 0; i < 10; i++) {
                        System.out.println("" + i + ": " + br.readLine());
                    }
                    break;
            }
        } catch (IOException ex) {
            ex.printStackTrace();
        }
    }
}
SLIDE 29 Real-world messaging
- A message from the OSI application layer is represented as a sequence of bytes on lower OSI levels
- When a message is too large to fit into a single network packet, it is split into several packets (which are also called messages)
- When parts of a message are transferred over the network, the order and the absence of corruption are guaranteed by TCP
Figure: split (1 2 3 4) → send over the Internet → receive (3 2 4 1).
We must wait for each part of a message, be able to discriminate between messages from distinct users, and accumulate fragments if the current size is not enough to decode the message.
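Accumulating fragments can be sketched with a length-prefixed format. The example below is an assumption for illustration, not the framing of any protocol in these slides: every message is preceded by a 4-byte length, one FrameAccumulator object is kept per connection (which is how messages of distinct users stay apart), and the names FrameAccumulator, feed and frame are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Collects fragments of one connection; assumes a 4-byte length prefix per message. */
final class FrameAccumulator {
    // Bytes received but not yet decoded; always left in "read mode".
    private ByteBuffer acc = ByteBuffer.allocate(0);

    /** Adds a fragment and returns every message that is now complete. */
    List<String> feed(byte[] fragment) {
        ByteBuffer merged = ByteBuffer.allocate(acc.remaining() + fragment.length);
        merged.put(acc).put(fragment);
        merged.flip();

        List<String> messages = new ArrayList<>();
        while (merged.remaining() >= 4) {
            int length = merged.getInt(merged.position()); // peek, do not consume
            if (merged.remaining() < 4 + length) {
                break; // not enough bytes yet: wait for the next fragment
            }
            merged.getInt(); // consume the length prefix
            byte[] body = new byte[length];
            merged.get(body);
            messages.add(new String(body, StandardCharsets.UTF_8));
        }
        acc = merged; // the unread tail stays for the next call
        return messages;
    }

    /** Encodes one message: 4-byte length followed by UTF-8 bytes. */
    static byte[] frame(String text) {
        byte[] body = text.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + body.length).putInt(body.length).put(body).array();
    }
}
```

Frameworks such as Netty ship ready-made decoders for this pattern, so in practice this logic is rarely written by hand.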
SLIDE 30 Netty
Netty is a NIO client-server framework, which enables quick and easy development of network applications, such as protocol servers and clients.
SLIDE 31 Readings
<!-- https://mvnrepository.com/artifact/io.netty/netty-example -->
<dependency>
    <groupId>io.netty</groupId>
    <artifactId>netty-example</artifactId>
    <version>${netty.version}</version>
</dependency>
SLIDE 32 Netty architecture
SLIDE 33 Netty: crash course
- Bootstrap or ServerBootstrap
- EventLoop
- EventLoopGroup
- ChannelPipeline
- Channel
- Future or ChannelFuture
- ChannelInitializer
- ChannelHandler
SLIDE 34 Netty: ServerBootstrap
// Configure the server.
EventLoopGroup bossGroup = new NioEventLoopGroup(1);
EventLoopGroup workerGroup = new NioEventLoopGroup();
try {
    ServerBootstrap b = new ServerBootstrap();
    b.group(bossGroup, workerGroup)
     .channel(NioServerSocketChannel.class)
     .handler(new LoggingHandler(LogLevel.INFO))
     .childHandler(new HttpServerInitializer(sslCtx)); // sslCtx = null if no SSL
    Channel ch = b.bind(PORT).sync().channel();
    ch.closeFuture().sync();
} finally {
    bossGroup.shutdownGracefully();
    workerGroup.shutdownGracefully();
}
SLIDE 35 BossGroup and WorkerGroup
BossGroup – accepts incoming connections, registers channels
WorkerGroup – processes I/O for channels
SLIDE 36 Message lifecycle
Typical pattern for message processing:
1. Decode or unpack the message
2. Process the message
3. Encode for network I/O
SLIDE 37 Netty: ChannelInitializer
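import io.netty.channel.ChannelInitializer;
import io.netty.channel.ChannelPipeline;
import io.netty.channel.socket.SocketChannel;
import io.netty.handler.codec.http.HttpRequestDecoder;
import io.netty.handler.codec.http.HttpResponseEncoder;
import io.netty.handler.ssl.SslContext;

public class HttpServerInitializer extends ChannelInitializer<SocketChannel> {
    private final SslContext sslCtx;

    public HttpServerInitializer(SslContext sslCtx) {
        this.sslCtx = sslCtx;
    }

    @Override
    public void initChannel(SocketChannel ch) {
        ChannelPipeline p = ch.pipeline();
        if (sslCtx != null) {
            p.addLast(sslCtx.newHandler(ch.alloc()));
        }
        p.addLast(new HttpRequestDecoder());  // inbound: bytes -> HTTP request
        p.addLast(new HttpResponseEncoder()); // outbound: HTTP response -> bytes
        p.addLast(new HttpServerHandler());   // business logic
    }
}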
SLIDE 38 Netty: thread model
A channel is created per connection. Once a Channel is assigned to a thread, it uses this thread throughout its lifetime, so a channel never has more than one I/O thread assigned to it. For a specific Channel, all ChannelHandlers (business logic) are guaranteed to be executed by one thread at a time.
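The idea of pinning a channel to one thread can be mimicked with plain JDK executors. The sketch below is an analogy only and is not how Netty is implemented: the class PinnedLoops, the integer channel ids and the modulo assignment are all assumptions made for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Analogy only (not Netty code): each channel id is pinned to one single-threaded executor. */
final class PinnedLoops {
    private final ExecutorService[] loops;

    PinnedLoops(int n) {
        loops = new ExecutorService[n];
        for (int i = 0; i < n; i++) {
            loops[i] = Executors.newSingleThreadExecutor();
        }
    }

    /** The same channel id always maps to the same executor, hence the same thread. */
    ExecutorService loopFor(int channelId) {
        return loops[Math.floorMod(channelId, loops.length)];
    }

    /** Runs a task for the channel on its pinned thread; tasks of one channel never overlap. */
    void execute(int channelId, Runnable task) {
        loopFor(channelId).execute(task);
    }

    void shutdown() {
        for (ExecutorService loop : loops) {
            loop.shutdown();
        }
    }
}
```

Because all work for a channel goes through one single-threaded executor, handler code for that channel needs no locking of per-channel state, which is the practical benefit of the thread model described above.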
SLIDE 39 ChannelPipeline
All handlers for a channel (both in- and out-) reside in the same single pipeline. Netty decides whether to use a particular handler for in- or out- traffic by checking “implements Channel{In, Out}boundHandler”
SLIDE 40 Netty: Handlers
Everything in a Netty ChannelPipeline is a handler. But they are logically divided into:
- Encoders/Decoders
- Business logic handlers
- Other types of handlers
SLIDE 41 Netty: Futures
All logic is asynchronous
To avoid blocking, a Future is returned
Example:
ch.closeFuture().sync();
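The two ways of consuming a future, waiting for it or registering a callback, can be tried with the JDK alone. The sketch below uses CompletableFuture as a stand-in for Netty's ChannelFuture; the class FutureDemo and its methods are hypothetical, and the "close" operation is simulated.

```java
import java.util.concurrent.CompletableFuture;

final class FutureDemo {
    /** Pretend asynchronous close: returns at once, the result arrives later. */
    static CompletableFuture<String> closeAsync() {
        return CompletableFuture.supplyAsync(() -> "closed");
    }

    /** Blocking style, comparable in spirit to sync(): wait until the operation completes. */
    static String closeAndWait() {
        return closeAsync().join();
    }

    /** Non-blocking style: register a callback and return immediately. */
    static CompletableFuture<Void> closeAndNotify(StringBuilder log) {
        return closeAsync().thenAccept(log::append);
    }
}
```

In the server bootstrap, sync() on the close future blocks the main thread on purpose, so that the application keeps running until the channel is closed; inside handlers, the callback style is the usual choice because it does not block an I/O thread.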
SLIDE 42 In/Out handlers
Adapters
SLIDE 43 Duplex handlers
Notice adapters
SLIDE 44 Netty: Business Logic Handlers
Only members up to the protected visibility level are shown for HttpServerHandler
SLIDE 45 Netty: Channel Attributes
Keep state from message to message
final AttributeKey<Boolean> userConnected = AttributeKey.valueOf("uconnected");
Attribute<Boolean> attr = ctx.attr(userConnected);
Boolean isConnected = attr.get();
if (null == isConnected) {
    // attribute has not been set yet
}
attr.set(Boolean.TRUE);
SLIDE 46 Netty and Mina
Frameworks help organize multithreading and message I/O, and provide generic handlers, message encoders/decoders, etc.
Mina is older than Netty, but is still in use and under development
http://netty.io/
https://mina.apache.org/
Both Netty and Mina were designed by the same author
SLIDE 47 ChronosServer: Apache Mina Experience
high performance data dissemination
The only system with distributed in-situ file-based access to date
SLIDE 48 ChronosServer abstraction layers
User view: time series
SELECT DATA FROM r2.wind.10m.u WHERE TIME = 01.01.2000 00:00
Reality: N files of diverse formats and naming on K cluster nodes, replicated
SLIDE 49 ChronosServer: query execution (shown for one client)
Figure labels: Climate Wikience, Internet, Gate, Cache, QE, Instruction, Data, Cluster nodes: ChronosServer
1. Issue query: SELECT DATA FROM r2.pressure.msl WHERE TIME = 01.01.2003 00:00
2. Parse query
3. Find nodes with data
4. Select node
5. Send query parameters
6. Find file and read data fragment from it
7. Result delivery
SLIDE 50 Computer cluster*
www.wikience.org/ru/ХроносСервер/ (c) Antonio Rodriges
6 + 2 + 2 nodes
- 24 terabytes HDD space
- 1 Gb local network
- 1 Gb Internet (optic fibre)
* ChronosServer now runs on a VPS
Analogs: TDS, ERDDAP, GeoServer, ArcGIS Image Server – not truly distributed
SELECT DATA FROM r2.pressure.msl WHERE TIME_INTERVAL = 01.01.2004 00:00 – 01.01.2006 00:00 AND REGION = (-90, -180, +90, +180)
SLIDE 51 Interactive 3D visualization
Climate Wikience – front end of ChronosServer
SLIDE 52 SciDB
SciDB – Scientific DB
- NoSQL
- AQL – Array Query Language
- AFL – Array Functional Language
- Distributed
- General-purpose multidimensional array DBMS
https://en.wikipedia.org/wiki/Michael_Stonebraker
SLIDE 53 Summary table
                     Execution time, seconds          Ratio, SciDB / ChronosServer
Operation            SciDB     ChronosServer          Cold       Hot
                               Cold      Hot
Data import          720.13    19.82     7.96         36.33      90.47
Max                   13.46     4.43     3.10          3.04       4.34
Min                   12.87     4.71     3.33          2.73       3.86
Average               21.42     4.71     3.23          4.55       6.63
Wind speed calc.      25.75     3.50     2.10          7.36      12.26
Chunk 100×20×16       56.19     1.68     0.374        33.45     150.24
Chunk 10×10×8        222.11     1.98     1.15        112.18     193.14

Depending on the operation, ChronosServer is roughly 3x to 193x faster than SciDB
http://doi.org/10.13140/RG.2.2.26922.21444
SLIDE 54 Custom protocols
SLIDE 55 Google Protocol Buffers
Google Protocol Buffers and its successors, Apache Avro and Thrift, are machine- and language-independent data serialization systems for high-performance network communication.
They are language-neutral, platform-neutral, extensible mechanisms for serializing structured data: https://developers.google.com/protocol-buffers/
A simple grammar is provided to define messages: data structures containing a set of fields, each of a predefined data type (string, array, integer, etc.) or another message. The definitions are used to generate classes representing Protobuf messages, together with serialization/deserialization code for them, in a given programming language.
SLIDE 56 WRRS
https://github.com/Wikience/WRRS-JS
SLIDE 57 WRRS.proto
WRRS.proto:
package org.wikience.wrrs.wrrsprotobuf;
option java_outer_classname = "RProtocol";

JavaScript:
var wrrsprotobuf = dcodeIO.ProtoBuf.newBuilder({})['import']({
    "package": "org.wikience.wrrs.wrrsprotobuf",
    "options": { "java_outer_classname": "RProtocol" },

Java:
package org.wikience.wrrs.wrrsprotobuf;
public final class RProtocol {
SLIDE 58 Simple message
message ConnectRequest {
    optional int32 clientID = 1;
    optional int32 protocolVersion = 2;
    optional bool retrieveDatasetTree = 3;
}

JavaScript:
var connectRequest = new self.PROTOBUF.ConnectRequest();
connectRequest.setProtocolVersion(….);
connectRequest.setRetrieveDatasetTree(….);
self.socket.onmessage = ….;
self.socket.send(connectRequest.toArrayBuffer());

Java:
byte[] rawMsg;
RProtocol.ConnectRequest connectReq = RProtocol.ConnectRequest.parseFrom(rawMsg);
SLIDE 59 Default clause
message IncludeRequestMeta {
    optional bool includeParams = 1 [default = false];
    optional bool includeAttributes = 2 [default = true];
    optional bool includeDimensions = 3 [default = true];
    optional bool includeRasterData = 4 [default = true];
}
message TLatLonBox {
    optional double latitudeNorth = 1 [default = -90.0];
    optional double latitudeSouth = 2 [default = 90.0];
    optional double longitudeEast = 3 [default = -180.0];
    optional double longitudeWest = 4 [default = 180.0];
}
SLIDE 60 Repeated
message RasterData {
    repeated double data = 1;
}
Treated as optional
Adding data:
RProtocol.RasterData.Builder arrBuilder = RProtocol.RasterData.newBuilder();
double temp = ...;
arrBuilder.addData(temp);
SLIDE 61 Response from Java
message RasterResponse {
    responseStatus = 1;
    optional RequestResponseMeta requestResponseMeta = 2;
    requestParams = 3;
    optional RasterAttributes rasterAttributes = 4;
    optional RasterDimensions rasterDimensions = 5;
    rasterData = 6;
}

RProtocol.RasterResponse.Builder rrB = RProtocol.RasterResponse.newBuilder();
RProtocol.ResponseStatus.Builder statB = RProtocol.ResponseStatus.newBuilder();
statB.setCode(1);
statB.setMessage(e.getMessage());
rrB.setResponseStatus(statB.build());
byte[] response = rrB.build().toByteArray();
SLIDE 62 WARNING
It is impossible to find out from the received bytes what Protobuf message type we have received.
We must know in advance what message type we are going to receive.
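One common way to "know in advance" is to send a small type tag before the serialized message. The sketch below is an assumption for illustration and is not part of WRRS or of Protobuf itself: the class TaggedEnvelope and the tag values are hypothetical, and the payload is treated as opaque bytes (for example, the result of toByteArray()).

```java
import java.util.Arrays;

final class TaggedEnvelope {
    // Hypothetical type tags; the names mirror messages from the slides.
    static final byte CONNECT_REQUEST = 1;
    static final byte RASTER_RESPONSE = 2;

    /** Prepends a one-byte type tag to an already serialized message. */
    static byte[] wrap(byte tag, byte[] serialized) {
        byte[] out = new byte[serialized.length + 1];
        out[0] = tag;
        System.arraycopy(serialized, 0, out, 1, serialized.length);
        return out;
    }

    /** Reads the tag so the receiver can pick the matching parseFrom call. */
    static byte tagOf(byte[] envelope) {
        return envelope[0];
    }

    /** Returns the serialized message without the tag. */
    static byte[] bodyOf(byte[] envelope) {
        return Arrays.copyOfRange(envelope, 1, envelope.length);
    }
}
```

The receiver would switch on tagOf(envelope) and pass bodyOf(envelope) to the parser of the corresponding message class. An alternative with the same effect is a single wrapper message whose optional fields hold the possible message types.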
SLIDE 63 Protobuf: compile
SET DST_DIR_JAVA="d:/RServer/src/main/java"
SET SRC_DIR_PROTOBUF="d:/RProtocol/protocol/"
SET DST_DIR_JS="d:/RProtocol/protocol/"
REM Compile Java version
protoc-2.6.1 --java_out=%DST_DIR_JAVA% WRRS.proto
REM Compile JavaScript version
c:/nodejs/pbjs %SRC_DIR_PROTOBUF%WRRS.proto -e org.wikience.wrrs.wrrsprotobuf -t js > %DST_DIR_JS%WRRS.proto.js
SLIDE 64 JavaScript: connect via WebSocket
self.socket = new WebSocket(URL);
self.socket.binaryType = "blob";
self.socket.onopen = self.onSocketOpenCallback;
self.socket.onclose = self.onSocketCloseCallback;
self.socket.onerror = self.onSocketError;
SLIDE 65 JavaScript: send via WebSocket
var connectRequest = new self.PROTOBUF.ConnectRequest();
connectRequest.setProtocolVersion(self.VERSION);
connectRequest.setRetrieveDatasetTree(self.RETR_DATASETS_TREE);
self.socket.onmessage = self.onRasterResponse;
self.socket.send(connectRequest.toArrayBuffer());
SLIDE 66 JavaScript: read data via WebSocket
self.onRasterResponse = function (event) {
    var reader = new FileReader();
    reader.onload = function () {
        var uint8Array = new Uint8Array(this.result);
        var response = {};
        try {
            response = self.PROTOBUF.RasterResponse.decode(uint8Array);
        } catch (err) {
            self.logErr("Corrupted protobuf " + err);
            return;
        }
    };
    reader.readAsArrayBuffer(event.data);
};
SLIDE 67 9 fallacies of distributed computing
1. The network is reliable.
2. Latency is zero.
3. Bandwidth is infinite.
4. The network is secure.
5. Topology doesn't change.
6. There is one administrator.
7. Transport cost is zero.
8. The network is homogeneous.
9. Location is irrelevant.
1–7: Peter Deutsch, 1994; 8: James Gosling, 1998; 9: ?, 2009
https://pages.cs.wisc.edu/~zuyu/files/fallacies.pdf
https://en.wikipedia.org/wiki/Fallacies_of_distributed_computing
https://blogs.oracle.com/hjfphd/entry/the_9th_fallacy_of_distributed
SLIDE 69 Readings
- http://live.osgeo.org/ru/standards/wps_overview.html
SLIDE 70 Readings
- Google Protocol Buffers, http://developers.google.com/protocol-buffers/
- Apache Thrift, https://thrift.apache.org/
SLIDE 71 Practice
Use Netty as a framework: http://netty.io/wiki/user-guide-for-4.x.html
Use Java serialization, Google Protocol Buffers or Apache Avro
Build your custom server and client that use a custom protocol to exchange messages: Java Server → Java Client
SLIDE 72 Backup slide: another picture of generic architecture
SLIDE 73 Mina architecture
SLIDE 74 MinaTimeServer
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.charset.Charset;
import org.apache.mina.core.service.IoAcceptor;
import org.apache.mina.core.session.IdleStatus;
import org.apache.mina.filter.codec.ProtocolCodecFilter;
import org.apache.mina.filter.codec.textline.TextLineCodecFactory;
import org.apache.mina.filter.logging.LoggingFilter;
import org.apache.mina.transport.socket.nio.NioSocketAcceptor;
SLIDE 75 MinaTimeServer
public class MinaTimeServer {
    private static final int PORT = 9123;

    public static void main(String[] args) throws IOException {
        IoAcceptor acceptor = new NioSocketAcceptor();
        acceptor.getFilterChain().addLast("logger", new LoggingFilter());
        acceptor.getFilterChain().addLast("codec",
                new ProtocolCodecFilter(new TextLineCodecFactory(Charset.forName("UTF-8"))));
SLIDE 76 MinaTimeServer
        acceptor.setHandler(new TimeServerHandler());
        acceptor.getSessionConfig().setReadBufferSize(2048);
        acceptor.getSessionConfig().setIdleTime(IdleStatus.BOTH_IDLE, 10);
        acceptor.bind(new InetSocketAddress(PORT));
    }
}
SLIDE 77 TimeServerHandler
import java.util.Date;
import org.apache.mina.core.session.IdleStatus;
import org.apache.mina.core.service.IoHandlerAdapter;
import org.apache.mina.core.session.IoSession;
SLIDE 78 TimeServerHandler
public class TimeServerHandler extends IoHandlerAdapter {
    @Override
    public void exceptionCaught(IoSession session, Throwable cause) throws Exception {
        cause.printStackTrace();
    }
SLIDE 79 TimeServerHandler
    @Override
    public void messageReceived(IoSession session, Object message) throws Exception {
        String str = message.toString();
        if (str.trim().equalsIgnoreCase("quit")) {
            session.close();
            return;
        }
        Date date = new Date();
        session.write(date.toString());
        System.out.println("Message written...");
    }
SLIDE 80 TimeServerHandler
    @Override
    public void sessionIdle(IoSession session, IdleStatus status) throws Exception {
        System.out.println("IDLE " + session.getIdleCount(status));
    }
}
SLIDE 81 Test
Client output:
user@myhost:~> telnet 127.0.0.1 9123
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
hello
Mon Apr 09 23:42:55 EDT 2007
quit
Connection closed by foreign host.
user@myhost:~>

Server output:
MINA Time server started.
Session created...
Message written...
SLIDE 82