 
A bit of background
● When the Urban Challenge got rolling, a software framework had to be chosen
  – ModUtils from Grand Challenge/NavLab
  – CMU IPC from … everything
  – A NIST package used over at NREC
  – Third party packages
ModUtils
● Upside: It did everything
  – Grand Challenge/NavLab demonstrate that it is capable
● Downside: It did everything
  – It was written in the wild west
  – Internal implementations of marshalling, comms, log file format, mini-STL, etc.
  – Tracking down bugs is intimidating
● Minimal documentation
  – Developed a reputation for having a steep learning curve
CMU IPC
● Develop programs around IPC
  – Ad-hoc, simple infrastructure for relatively small systems
  – Support packages like MicroRaptor (a process management system)
● Maybe a bit spartan
NIST package
● Stodgy, bureaucratic mess
  – Lots of arcane text files to configure minor details
● Not particularly development friendly
● No mandate, don't bother
Third Party Systems
● Basic/unsuitable
  – No evidence of high performance uses
  – Design choices that are obviously not suitable performance-wise
  – Limited capabilities
● Incomplete
  – First have to understand system, then extend it
● Incompatible
  – Not shopping for a new model, just an implementation
Framework of Reuse
● Take ModUtils model, re-implement with a priority on simplicity and reuse
  – Config files: Ruby
  – Marshalling: boost::serialization
  – Comms: CMU IPC
  – Log file format: Berkeley DB
  – UI: Qt integrated with interface mechanism
  – task library to glue it all together
task library
● A software jig
  – Performs common functions all tasks require
● Uses Ruby to evaluate a script that results in configuration values
  – The Ruby script can either contain static values or dynamically generate values based on external stimuli
● Instantiates and configures interfaces
  – C++ virtual classes, implementations dynamically loaded at run time depending on configuration
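The "virtual class, implementation chosen by configuration" idea can be sketched as below. This is only an illustration under stated assumptions: `RangeSensor`, its two implementations, and `make_interface` are hypothetical names, and an in-process registry stands in for the run-time dynamic loading the slide mentions.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical interface: an abstract base class that tasks program against.
class RangeSensor {
public:
    virtual ~RangeSensor() = default;
    virtual std::string name() const = 0;
    virtual double read_meters() = 0;
};

// Two hypothetical implementations of the same interface.
class SimulatedRangeSensor : public RangeSensor {
public:
    std::string name() const override { return "simulated"; }
    double read_meters() override { return 1.5; }
};

class PlaybackRangeSensor : public RangeSensor {
public:
    std::string name() const override { return "playback"; }
    double read_meters() override { return 2.5; }
};

using Factory = std::function<std::unique_ptr<RangeSensor>()>;

// Assumption: a static registry replaces loading shared objects at run time.
const std::map<std::string, Factory>& registry() {
    static const std::map<std::string, Factory> table = {
        {"simulated", [] { return std::unique_ptr<RangeSensor>(new SimulatedRangeSensor()); }},
        {"playback", [] { return std::unique_ptr<RangeSensor>(new PlaybackRangeSensor()); }},
    };
    return table;
}

// The string would come from the evaluated configuration script.
std::unique_ptr<RangeSensor> make_interface(const std::string& implementation) {
    const auto it = registry().find(implementation);
    if (it == registry().end()) {
        throw std::invalid_argument("unknown implementation: " + implementation);
    }
    return it->second();
}
```

A task would call `make_interface(...)` with a value taken from its configuration and use only the `RangeSensor` interface afterwards.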
Interfaces
● At the heart of the interface base class is a Channel
  – Not every interface uses the channel, as on the perimeter of the system they actually interact with hardware, etc.
    ● But the bulk of interfaces are conduits to remote interfaces or other tasks
● TypedChannel<T> reads/writes instances of type T
  – It internally marshals or unmarshals T to a byte stream with boost::serialization
    ● Channel just acts on std::strings
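The layering of a typed channel over a string-only channel can be sketched as follows. The class bodies here are assumptions, not the real SimpleComms code: the `Channel` is an in-memory loopback, `Pose` is a hypothetical message type, and a raw byte copy of a trivially copyable struct stands in for boost::serialization so the example has no external dependencies.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <deque>
#include <string>
#include <type_traits>

// Untyped channel: it only ever sees std::string payloads.
// Assumption: an in-memory queue replaces the real transport.
class Channel {
public:
    void write(const std::string& bytes) { queue_.push_back(bytes); }
    bool read(std::string& bytes) {
        if (queue_.empty()) return false;
        bytes = queue_.front();
        queue_.pop_front();
        return true;
    }
private:
    std::deque<std::string> queue_;
};

// Hypothetical message type.
struct Pose {
    double x;
    double y;
    std::int32_t sequence;
};

// Stand-in for boost::serialization: copy the object's bytes.
// Only valid for trivially copyable types on a single architecture.
template <typename T>
std::string marshal(const T& value) {
    static_assert(std::is_trivially_copyable<T>::value, "raw copy needs a trivially copyable type");
    return std::string(reinterpret_cast<const char*>(&value), sizeof(T));
}

template <typename T>
bool unmarshal(const std::string& bytes, T& value) {
    static_assert(std::is_trivially_copyable<T>::value, "raw copy needs a trivially copyable type");
    if (bytes.size() != sizeof(T)) return false;
    std::memcpy(&value, bytes.data(), sizeof(T));
    return true;
}

// Typed view: marshals on write, unmarshals on read, delegates to Channel.
template <typename T>
class TypedChannel {
public:
    explicit TypedChannel(Channel& channel) : channel_(channel) {}
    void write(const T& value) { channel_.write(marshal(value)); }
    bool read(T& value) {
        std::string bytes;
        return channel_.read(bytes) && unmarshal(bytes, value);
    }
private:
    Channel& channel_;
};
```

Keeping `Channel` string-only means the transport never needs to know about message types; only the thin template wrapper does.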
Why develop SimpleComms?
● Started with IPC
  – Readily available
    ● Original authors are local
    ● Team members have had prior experience
  – Simple to use
    ● Launch central on one host
    ● Set environment variable CENTRALHOST to that host on each machine
    ● IPC_connect, IPC_subscribe, IPC_publish, IPC_handleMessage
Why not stick with IPC?
● All communications routed through one daemon in central mode
  – Multiple trips across network
  – Serializes everything through one bottleneck
[Diagram: Central Mode, showing tasks T1 to T6 on Machines One, Two, and Three, and Central]
Limits of IPC
● There is a direct mode
  – It *cough* works *cough*
  – Still necessitates multiple network trips per publish
[Diagram: Direct Mode, showing tasks T1 to T6 on Machines One, Two, and Three, and Central]
Limits of IPC
● If the intermediate buffers are nearly full, it's possible to enter a deadlock where both parties are writing
[Diagram: Dead Lock, between Central and a Task]
SimpleComms Topology
● Local routing daemons on each machine
  – Frees tasks from network comms/multiple deliveries
  – One delivery per host
[Diagram: SimpleComms, showing tasks T1 to T6 on Machines One, Two, and Three, with one SCS daemon per machine]
SCS Internal Message Format
● Externally, SCS accepts/conveys std::strings of any length
● Internally, messages are segmented to a bit less than 64k
  – Trade-off between throughput and connectivity
  – Compatible with UDP
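Segmenting an arbitrary-length string into bounded fragments and putting it back together can be sketched like this. The `Fragment` layout, the function names, and the 60000-byte limit are all assumptions for illustration; the slides only say "a bit less than 64k" and do not describe the actual SCS header.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Assumed limit; chosen only to sit below the 64k datagram ceiling.
constexpr std::size_t kMaxPayload = 60000;

// Hypothetical fragment layout, not the real SCS wire format.
struct Fragment {
    std::uint32_t message_id;  // which message this piece belongs to
    std::uint32_t index;       // position of this piece, starting at 0
    std::uint32_t count;       // total number of pieces
    std::string payload;
};

std::vector<Fragment> segment(std::uint32_t message_id, const std::string& message) {
    // An empty message still produces one empty fragment so it gets delivered.
    const std::size_t count =
        std::max<std::size_t>(1, (message.size() + kMaxPayload - 1) / kMaxPayload);
    std::vector<Fragment> fragments;
    fragments.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        Fragment fragment;
        fragment.message_id = message_id;
        fragment.index = static_cast<std::uint32_t>(i);
        fragment.count = static_cast<std::uint32_t>(count);
        fragment.payload = message.substr(i * kMaxPayload, kMaxPayload);
        fragments.push_back(fragment);
    }
    return fragments;
}

// Assumes the fragments arrive complete and in order (as over a reliable link).
bool reassemble(const std::vector<Fragment>& fragments, std::string& message) {
    if (fragments.empty() || fragments.size() != fragments.front().count) return false;
    std::string joined;
    for (std::size_t i = 0; i < fragments.size(); ++i) {
        const Fragment& fragment = fragments[i];
        if (fragment.index != i || fragment.message_id != fragments.front().message_id) {
            return false;
        }
        joined += fragment.payload;
    }
    message.swap(joined);
    return true;
}
```

With the assumed limit, a 150000-byte message becomes three fragments of 60000, 60000, and 30000 bytes.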
SimpleComms Protocols
● Local communication is done with Unix Domain Sockets in datagram mode
  – Originally selected for the simplicity of having structured connectivity
  – Each task and SCS binds to a socket in Linux's virtual name-space to receive messages
  – Downside with datagrams is the discrepancy in criteria used by select and write (number of outstanding messages versus bytes)
  – Blocking I/O is used to avoid spinning between select/write
  – The workaround was sufficient; otherwise SCS would have been transitioned to stream mode
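A minimal sketch of datagram-mode Unix domain sockets follows, assuming that "virtual name-space" refers to Linux's abstract socket namespace (an address whose `sun_path` starts with a NUL byte, so no file appears on disk). The helper names and socket names are hypothetical, and the code is Linux-only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

// Fills addr with an abstract-namespace address; returns its length, or 0 if
// the name does not fit.
socklen_t make_abstract_address(const std::string& name, sockaddr_un& addr) {
    std::memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    if (name.size() + 1 > sizeof(addr.sun_path)) return 0;
    // sun_path[0] stays '\0', which marks the address as abstract.
    std::memcpy(addr.sun_path + 1, name.data(), name.size());
    return static_cast<socklen_t>(offsetof(sockaddr_un, sun_path) + 1 + name.size());
}

// Creates a datagram socket bound to the given abstract name; -1 on failure.
int bind_datagram_socket(const std::string& name) {
    sockaddr_un addr;
    const socklen_t length = make_abstract_address(name, addr);
    if (length == 0) return -1;
    const int fd = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (fd < 0) return -1;
    if (bind(fd, reinterpret_cast<const sockaddr*>(&addr), length) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

// Sends one datagram to the socket bound under the name `to`.
bool send_datagram(int fd, const std::string& to, const std::string& payload) {
    sockaddr_un addr;
    const socklen_t length = make_abstract_address(to, addr);
    if (length == 0) return false;
    const ssize_t sent = sendto(fd, payload.data(), payload.size(), 0,
                                reinterpret_cast<const sockaddr*>(&addr), length);
    return sent == static_cast<ssize_t>(payload.size());
}

// Blocks until one datagram arrives and returns it whole.
bool receive_datagram(int fd, std::string& payload) {
    char buffer[65536];
    const ssize_t received = recv(fd, buffer, sizeof(buffer), 0);
    if (received < 0) return false;
    payload.assign(buffer, static_cast<std::size_t>(received));
    return true;
}
```

Because each datagram keeps its boundaries, the receiver gets exactly one message per `recv` call, which is the "structured connectivity" convenience the slide refers to.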
SimpleComms Protocols
● Broadcast UDP is used for discovery
  – “zero conf” as the SCS daemons discover each other and automatically connect
  – Default port and Ethernet interface used for broadcast are the points of configurability
    ● Changing the port allows running independent networks, e.g. for simulations
    ● Binding to a particular interface permits keeping SCS on different robots from mixing
      – eth0 is masked /16, the facility network
      – eth0:1 is masked /24, limited to robot
SimpleComms Protocols
● Developed both UDP and TCP modules for inter-host connectivity
  – Concerns about TCP delays from dropped packets and retransmits were not really an issue on the robot, with high-quality Ethernet hardware completely integrated in the chassis
  – Losing messages to UDP, or developing retransmit logic, never became a priority
  – A pair of TCP connections between each machine, used as one-way conduits
    ● Side effect of concurrent discovery
SimpleComms Multi-threading
● SCS has two primary threads
  – A reading thread polls for input and distributes messages to subscribed queues
  – A writing thread empties the queues when sockets flag available
[Diagram: Dedicated threads servicing buffers. Labels: SCS, Read, Task, Unix, Write, Client]
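The hand-off between the two threads can be sketched with thread-safe per-subscriber queues. Everything here is an assumption about shape rather than the SCS implementation: `MessageQueue`, `Router`, and `drain` are hypothetical names, `distribute` is what a reading thread would call for each incoming message, and `drain` is the loop a writing thread would run. No threads are started in the sketch itself.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <map>
#include <mutex>
#include <string>
#include <vector>

// Queue shared between a producer (reader thread) and a consumer (writer thread).
class MessageQueue {
public:
    void push(const std::string& message) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (closed_) return;
            messages_.push_back(message);
        }
        ready_.notify_one();
    }
    // Blocks until a message is available; returns false once closed and empty.
    bool pop(std::string& message) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return closed_ || !messages_.empty(); });
        if (messages_.empty()) return false;
        message = messages_.front();
        messages_.pop_front();
        return true;
    }
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }
private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::deque<std::string> messages_;
    bool closed_ = false;
};

// Maps a channel name to the queues subscribed to it.
class Router {
public:
    void subscribe(const std::string& channel, MessageQueue* queue) {
        std::lock_guard<std::mutex> lock(mutex_);
        subscribers_[channel].push_back(queue);
    }
    // Copies the message into every subscribed queue; returns how many received it.
    std::size_t distribute(const std::string& channel, const std::string& message) {
        std::vector<MessageQueue*> targets;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            const auto it = subscribers_.find(channel);
            if (it != subscribers_.end()) targets = it->second;
        }
        for (MessageQueue* queue : targets) queue->push(message);
        return targets.size();
    }
private:
    std::mutex mutex_;
    std::map<std::string, std::vector<MessageQueue*>> subscribers_;
};

// Writer-side loop: empties the queue until it is closed and drained.
// A real writer would send each message on a socket instead of collecting it.
std::vector<std::string> drain(MessageQueue& queue) {
    std::vector<std::string> sent;
    std::string message;
    while (queue.pop(message)) sent.push_back(message);
    return sent;
}
```

Decoupling the two sides through queues means a slow consumer only grows its own queue; it does not stall the reader or other subscribers.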
SimpleComms Multi-threading
● Client library contains a receiving thread
  – Services the incoming messages even when the task is busy
  – Messages are sent immediately
[Diagram: Dedicated threads servicing buffers. Labels: SCS, Read, Task, Unix, Write, Client]
Vectored I/O
● Minimize in-memory copies
  – Use vectored I/O to deliver directly to destination
[Diagram: Vectored I/O. Labels: Header, Fragment, recv, Copy #1, Copy #2, Peek Header, recvmsg, sendmsg, One Copy]
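The peek-then-scatter pattern named in the diagram labels can be sketched with `sendmsg` and `recvmsg` on a Unix datagram socket. The `FragmentHeader` layout and function names are hypothetical, and the sketch assumes the receiver has already sized the destination buffer for the whole message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

// Hypothetical header: where the fragment belongs in the full message.
struct FragmentHeader {
    std::uint32_t offset;
    std::uint32_t length;
};

// Gathers header and fragment from two separate buffers into one datagram,
// without first copying them into a contiguous staging buffer.
bool send_fragment(int fd, const std::string& message, std::uint32_t offset,
                   std::uint32_t length) {
    if (offset > message.size() || length > message.size() - offset) return false;
    FragmentHeader header = {offset, length};
    iovec parts[2];
    parts[0].iov_base = &header;
    parts[0].iov_len = sizeof(header);
    parts[1].iov_base = const_cast<char*>(message.data() + offset);
    parts[1].iov_len = length;
    msghdr msg = {};
    msg.msg_iov = parts;
    msg.msg_iovlen = 2;
    const ssize_t sent = sendmsg(fd, &msg, 0);
    return sent == static_cast<ssize_t>(sizeof(header) + length);
}

// Peeks at the header to learn the offset, then scatters the datagram so the
// payload lands directly at its final position in `destination`.
bool receive_fragment(int fd, std::string& destination) {
    FragmentHeader header;
    ssize_t got = recv(fd, &header, sizeof(header), MSG_PEEK);  // datagram stays queued
    if (got != static_cast<ssize_t>(sizeof(header))) return false;
    if (header.offset > destination.size() ||
        header.length > destination.size() - header.offset) {
        return false;  // note: the datagram is left unread in this case
    }
    iovec parts[2];
    parts[0].iov_base = &header;
    parts[0].iov_len = sizeof(header);
    parts[1].iov_base = &destination[header.offset];
    parts[1].iov_len = header.length;
    msghdr msg = {};
    msg.msg_iov = parts;
    msg.msg_iovlen = 2;
    got = recvmsg(fd, &msg, 0);
    return got == static_cast<ssize_t>(sizeof(header) + header.length);
}
```

The alternative, a plain `recv` into a scratch buffer followed by a copy into the reassembly buffer, costs one extra in-memory copy per fragment.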