 
Technical changes since the last Tor talk
Nick Mathewson, The Tor Project <nickm@torproject.org>
Defcon XV, Aug 4, 2007
Marty! We've got to go back to 2004!
● Tor was working, usable, and seemed pretty secure. (v0.0.7.2)
● Pretty small network.
● No GUI: hard to use.
● We got a couple of Defcon talks!
What we've been up to since then.
● Hacking on Tor. (Latest is 0.2.0.4-alpha.)
  – Security: adding features/fixing security bugs.
  – Scalability: adding capacity is hard.
  – Scalability: using capacity is hard.
  – Usability: adding GUIs, fixing bugs.
  – Integration: working nicely with other apps is hard.
  – Lots more: see the changelog.
● Growing the network: ~200k users, ~1k servers.
Outline
● Prelude: brief, fast introduction to Tor
● Directories and server discovery changes: More secure, more scalable!
● Path generation changes: More efficient, less filling!
● Circuit-building protocol changes: Oops. Crypto is hard.
● Some fun new tools and features: What do you mean, I need to edit a file?
Intro anonymity: anonymity networks hide users among users.
[Diagram: Alice1, Alice2, and Alice3 reach Bob1 and Bob2 through the network.]
Intro Tor: there are a bunch of servers, connected via TLS (SSL).
[Diagram: a mesh of servers (S).]
Intro Tor: clients build circuits through a network of decrypting relays.
[Diagram: Alice1 builds a circuit through the servers in steps 1 to 3.]
Intro Tor: circuits are used to relay multiple TCP streams.
[Diagram: in steps 1 to 6, Alice1's circuit carries streams to Bob1 and Bob2.]
See also: PipeNet, Onion Routing
A hostile first hop can tell Alice is talking, but not to whom.
A hostile last hop can tell somebody is talking to Bob, but not who.
But: two hostile hops can correlate traffic patterns and link Alice to Bob.
No obvious fix that isn’t extra-slow.
I. Directories and server discovery
We need to tell clients about servers.
● Every client must know every server.
  – (If you just ask a server for a list of neighbors, it can trivially lie.)
● All clients must know the same servers.
● Servers shouldn’t be able to impersonate each other.
  – (Use self-signed descriptions; identity by PK.)
● Bandwidth matters a lot.
Server discovery is hard because misinformed clients lose anonymity.
[Diagram: the servers known to Alice1 differ from the servers known to Alice2.]
2004: every authority published a big list of server information. That was slow.
[Diagram: servers S1..Sn, several authorities, and clients that download from the authorities.]
Adding caches helped with performance...
[Diagram: caches sit between the authorities and the clients.]
But a single bad authority could still break clients badly...
And most information was redundant.
Client: “What's the directory?”
Cache: Sign(Desc1, Desc2, Desc3..Desc99)
Client: “What's the directory?”
Cache: Sign(Desc1, Desc3..Desc99, Desc100)
So split directory into status (signed) and individual descriptors. (2005)
Client: “What do authorities A and B say?”
Cache: SignA(digest list), SignB(digest list)
Client: “Send me descriptor with digest X”
Cache: Descriptor with digest X
Remaining problems: partitioning, redundancy.
Naming and requesting descriptors by digest prevents attacks.
Authorities: “Use server whose identity key is X.” (S1 has ID = X.)
Cache, to client: “Here’s one just for you!”
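The digest check behind this slide can be sketched roughly as follows. This is a minimal illustration, not Tor's actual code: the function names, the choice of SHA-1 over the whole descriptor, and the sample descriptor text are all assumptions made for the example.

```python
import hashlib


def descriptor_digest(descriptor: bytes) -> str:
    """Hex SHA-1 of the descriptor bytes (hash choice is an assumption)."""
    return hashlib.sha1(descriptor).hexdigest()


def accept_descriptor(descriptor: bytes, signed_digests: set) -> bool:
    """Accept a descriptor only if the signed status document lists its digest."""
    return descriptor_digest(descriptor) in signed_digests


# Hypothetical descriptors: the real one, and one a cache made "just for you".
real = b"router S1 18.244.0.188 9001\nbandwidth 1000\n"
forged = b"router S1 10.0.0.66 9001\nbandwidth 1000\n"

# The digest list comes from the authorities' signed status, not from the cache.
listed = {descriptor_digest(real)}
```

A cache can still refuse to answer, but it cannot substitute a descriptor of its own: `accept_descriptor(forged, listed)` is false because the forged bytes hash to a digest the authorities never signed.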
Authorities now vote on a single consensus status document. (2007)
1. Distribute signed opinions.
2. Compute result of vote, and sign it.
3. Distribute signatures; make multi-signed document.
4. Clients check signatures.
5. Profit!
[Diagram: servers S1..Sn and several authorities.]
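Step 2, computing the result of the vote, might look something like the sketch below. The strict-majority rule, the data layout, and the name `compute_consensus` are assumptions for illustration; signing and signature checking (steps 1, 3, and 4) are left out entirely.

```python
from collections import Counter


def compute_consensus(votes: dict) -> dict:
    """Map each server to the flags a strict majority of authorities assigned."""
    needed = len(votes) // 2 + 1
    tally = {}
    for opinion in votes.values():
        for server, flags in opinion.items():
            tally.setdefault(server, Counter()).update(flags)
    return {
        server: sorted(flag for flag, n in counts.items() if n >= needed)
        for server, counts in tally.items()
    }


# Three hypothetical authorities (A, B, C) and their opinions.
votes = {
    "A": {"S1": {"Running", "Fast"}, "S2": {"Running"}},
    "B": {"S1": {"Running"}, "S2": {"Running", "Exit"}},
    "C": {"S1": {"Running", "Fast"}},
}
consensus = compute_consensus(votes)
```

Because every authority runs the same deterministic computation over the same set of opinions, they all arrive at the same document and can each sign it.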
Authorities say more than “yes/no” for each server.
● Named?
● Running?
● Valid?
● Fast?
● Stable?
● Bad exit?
● Exit?
● Guard?
● Authority?
(Actually determining these can be hard.)
(Keywords define client behavior; authorities improve criteria.)
II. Path generation
2004: all servers chosen with equal* probability, regardless of capacity.
Big servers were underused. Tiny servers were overloaded.
[Diagram: a client choosing among servers with bandwidths from bw=x/2 to bw=4x.]
Now: Bandwidth is not uniform, so don't select uniformly.
[Diagram: the same servers, now with selection weights from p=x/2 to p=4x.]
(But cap the maximum to prevent trust bottlenecks.)
[Diagram: a server claiming “I can push a terabit. No, really!” still gets a capped weight.]
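Bandwidth-weighted selection with a cap can be sketched in a few lines. The names, the cap value, and the server table below are made up for the example; this shows the idea, not Tor's actual selection code.

```python
import random


def selection_weights(bandwidths: dict, cap: float) -> dict:
    """Weight each server by its claimed bandwidth, clipped at `cap`."""
    return {name: min(bw, cap) for name, bw in bandwidths.items()}


def choose_server(bandwidths: dict, cap: float, rng: random.Random) -> str:
    """Pick one server with probability proportional to its capped weight."""
    weights = selection_weights(bandwidths, cap)
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Claimed bandwidths in units of x; "liar" is the terabit server.
claimed = {"tiny": 0.5, "small": 1.0, "medium": 2.0, "big": 4.0, "liar": 1_000_000.0}
weights = selection_weights(claimed, cap=2.0)
picked = choose_server(claimed, cap=2.0, rng=random.Random(0))
```

Without the cap, "liar" would attract nearly every circuit just by advertising a huge number; with it, an unverified claim buys no more weight than an ordinary fast server.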
Unstable servers are useful, but not for long-lived connections (SSH, IM, ...).
[Diagram: a client choosing among servers with uptimes of 10 days and 1 hour.]
Use long-lived servers for long-lived connections.
[Diagram: the servers with 10 days of uptime are marked “Okay for port 22.”]
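One way to picture the rule: filter the candidate servers by uptime whenever the destination port suggests a long-lived connection. The port set and the 24-hour threshold below are illustrative assumptions, not Tor's real values.

```python
LONG_LIVED_PORTS = {22, 5222, 6667}  # illustrative: SSH, XMPP, IRC
MIN_STABLE_UPTIME_HOURS = 24         # hypothetical threshold


def usable_servers(uptime_hours: dict, port: int) -> list:
    """Return the servers a path to `port` may use, sorted by name."""
    if port not in LONG_LIVED_PORTS:
        return sorted(uptime_hours)
    return sorted(
        name for name, hours in uptime_hours.items()
        if hours >= MIN_STABLE_UPTIME_HOURS
    )


# Uptimes from the slide: 10 days (240 hours) or 1 hour.
uptimes = {"S1": 240, "S2": 1, "S3": 240, "S4": 1, "S5": 240}
```

Short-lived servers remain available for everything else, such as web traffic on port 80, so their capacity is not wasted.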
Our original “random” path-selection approach made sure that every client would eventually be profiled.
Alice loses if first and last hop are evil. (Correlation attacks)
Suppose c/n nodes (bandwidthwise) are compromised.
Therefore, (c/n)^2 of Alice's circuits are compromised.
Therefore, if Alice's behavior stays the same, she will eventually lose.
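The "eventually" step can be made explicit. Assuming each circuit picks its first and last hop independently (an assumption added here for the calculation), the chance that at least one of k circuits is compromised approaches 1 as k grows:

```latex
% p: compromised fraction of the network (by bandwidth); k: circuits Alice builds
p = \frac{c}{n}, \qquad
\Pr[\text{a given circuit is compromised}] = p^2, \qquad
\Pr[\text{at least one of } k \text{ circuits is compromised}]
  = 1 - \left(1 - p^2\right)^k \xrightarrow[k \to \infty]{} 1 \quad (p > 0).

% Worked example: p = 0.1, k = 100
1 - (1 - 0.01)^{100} = 1 - 0.99^{100} \approx 0.63
```

So even with only a tenth of the network compromised, about a hundred fresh circuits already give the attacker better-than-even odds of seeing both ends of one.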
Tor clients now use “guard” servers to give long-term Alice a chance.
[Diagram: Alice's circuits all enter the network through a small set of servers.]
Chosen at random*, held fixed**.
If Alice’s guards are good, Alice never has a vulnerable path.
Okay, so guard nodes might go down.
[Diagram: one of Alice's guards is down (X).]
So add more as needed, but keep them in order...
[Diagram: new guards are added after the existing ones.]
...so we can go back to the original set when they come back online.
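The ordered-list behavior can be sketched as below. The function name, the `is_up` table, and the fixed candidate order are stand-ins for illustration; a real client would pick new guards at random and track reachability itself.

```python
def pick_guards(ordered_guards: list, is_up: dict, wanted: int, candidates: list) -> list:
    """Return the first `wanted` live guards, appending new ones only if needed.

    `ordered_guards` is mutated: new guards go at the end, so older guards
    keep priority and are used again once they come back online.
    """
    spare = [c for c in candidates if c not in ordered_guards]
    while sum(is_up[g] for g in ordered_guards) < wanted and spare:
        ordered_guards.append(spare.pop(0))
    return [g for g in ordered_guards if is_up[g]][:wanted]


guards = ["G1", "G2"]
is_up = {"G1": False, "G2": True, "G3": True, "G4": True}

# G1 is down, so G3 is appended and used alongside G2.
first = pick_guards(guards, is_up, 2, ["G3", "G4"])

# G1 comes back: the original pair is used again, and G3 stays in reserve.
is_up["G1"] = True
second = pick_guards(guards, is_up, 2, ["G3", "G4"])
```

Keeping the order matters: if the client forgot its original guards whenever they went down, an attacker could knock guards offline until the client landed on a hostile one.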
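Old Tor: circuits built on-demand only. This was slow.
Predict desired ports based on past behavior.
[Diagram: Alice keeps one circuit whose exit allows ports 80 and 22, and another whose exit allows port 8001.]
“Cannibalize” unused circuits for faster response to requests that no existing circuit supports.
[Diagram: for a service on a weird port, Alice reuses an unused circuit and ends it at an exit that allows the weird port.]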
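A rough sketch of the decision a client makes when a request arrives. The data layout and the name `pick_circuit` are hypothetical, and extending an idle circuit by one hop is a simplified reading of "cannibalize" rather than a description of Tor's exact behavior.

```python
def pick_circuit(port: int, circuits: list, exits: dict):
    """Return (path, how) for a request to `port`.

    circuits: already-built, unused paths (lists of server names).
    exits: maps an exit's name to the set of ports its policy allows.
    """
    for path in circuits:
        if port in exits.get(path[-1], set()):
            return path, "reused"
    for exit_name, ports in exits.items():
        if port in ports and circuits and exit_name not in circuits[0]:
            # Cannibalize: extend an idle circuit by one hop instead of
            # building a whole new circuit from scratch.
            return circuits[0] + [exit_name], "cannibalized"
    return None, "build-new"


exits = {"E1": {80, 22}, "E2": {8001}, "E3": {31337}}
circuits = [["G1", "M1", "E1"], ["G1", "M2", "E2"]]
```

Requests for predicted ports are served immediately from prebuilt circuits; a request for the weird port 31337 costs one extra hop rather than a full circuit build.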
III. Circuit-building protocol
Extend by IP:Port was insufficient: nodes don't all know each other.
Alice, to S1: “Extend this circuit to S2 at 18.244.0.188:9010”
S1: “Uh, how?”
In practice, server knowledge is not 100% synchronized.
So, use identity key and IP.
Using key-only ID for this created an MITM attack.
Evil, to S1: “Extend this circuit to S2 at evil:9010”
Only good for traffic analysis... but other users were affected.
(So, don’t use only identity key.)
Using encrypted create cell for first hop was needless crypto.
Old: Alice sends E(g^x); S replies g^y, H(K=g^xy). “Uh, guys? This is TLS.”
New: Alice sends X; S replies Y, H(K=H(X|Y)). (Already encrypted, authenticated.)
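The new first-hop exchange replaces public-key operations with a couple of hashes. The sketch below follows the slide's notation only; the use of SHA-1, the 20-byte random values, and the name `derive` are assumptions, and real key derivation involves more than a single hash.

```python
import hashlib
import os


def derive(x: bytes, y: bytes):
    """Return (K, H(K)) with K = H(X|Y), following the slide's notation."""
    k = hashlib.sha1(x + y).digest()
    return k, hashlib.sha1(k).digest()


# Alice picks X and sends it over the existing TLS connection.
x = os.urandom(20)

# The server picks Y, derives K, and replies with Y and H(K).
y = os.urandom(20)
server_key, server_check = derive(x, y)

# Alice derives the same K and compares H(K) with the server's reply.
alice_key, alice_check = derive(x, y)
```

This is only reasonable for the first hop, where the TLS link already hides X and Y from outsiders and authenticates the server; later hops still need the full handshake.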
Speaking of cryptography, check for bad values of g^x, g^y.
[Diagram: Client, then Bad server, then Server 2.]
Client sends E2(g^x); the bad server passes on E2(g^0).
Server 2 replies g^y, H(g^0y); the bad server passes back g^0, H(g^x0). “oops.”
(But once we checked for bad g^x, g^y, Ian Goldberg could prove this protocol secure.)
(Also, we patched OpenSSL for this.)
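The check itself is tiny. The sketch below uses a toy prime and generator so the numbers are easy to follow; the function name and the exact accepted range [2, p - 2] are a common convention assumed here, not a quote from Tor's code.

```python
def is_valid_dh_public(value: int, p: int) -> bool:
    """Reject degenerate public values: anything outside [2, p - 2]."""
    return 2 <= value <= p - 2


# Toy parameters for illustration only; real groups use very large primes.
P = 23
G = 5

secret_y = 6
honest_public = pow(G, secret_y, P)   # a normal g^y
degenerate_public = pow(G, 0, P)      # g^0 = 1, as injected by the bad server

# If g^0 is accepted, the "shared secret" is 1 no matter what y is.
forced_secret = pow(degenerate_public, secret_y, P)
```

Rejecting 0, 1, and p - 1 stops the man in the middle from forcing both sides onto a key that it can predict without knowing either secret exponent.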
IV. Tools and features
Old Tor: everybody must speak SOCKS.
[Diagram: browser -> HTTP -> Privoxy/polipo -> SOCKS -> Tor; gaim -> SOCKS -> Tor; App -> TCP -> ??????? -> Tor.]
The old solutions kind of sucked.
[Diagram: as before, but the App reaches Tor over SOCKS through replaced libc calls (Linux/BSD).]
On Windows, you could do a net driver... OSX was screwed.
TransPort (+iptables/pf) supports any TCP app.
[Diagram: App -> HTTP -> Privoxy/polipo -> SOCKS -> Tor; App -> SOCKS -> Tor; App -> TCP + address -> Tor (Linux, BSD or OSX).]
You can also use a VM as your router: see JanusVM.
Problem: DNS leaks are hard to solve.
Dumb app, to DNS: “Where is naughty.com?”
DNS: “1.2.3.4!”
Dumb app, to Tor (SOCKS): “get me 1.2.3.4!”
Old solution: “use SOCKS4a or else!”
Smart app, to Tor (SOCKS): “get me naughty.com!”
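What makes an app "smart" here is that it hands the hostname to the proxy instead of resolving it first. A minimal sketch of a SOCKS4a CONNECT request follows; the function name is hypothetical, and only the request bytes are built (no socket is opened, no reply is parsed).

```python
import struct


def socks4a_connect_request(hostname: str, port: int, user_id: bytes = b"") -> bytes:
    """Build a SOCKS4a CONNECT request that carries a hostname, not an IP."""
    # Version 4, command 1 (CONNECT), destination port in network byte order.
    header = struct.pack(">BBH", 4, 1, port)
    # An address of the form 0.0.0.x (x nonzero) signals "hostname follows".
    placeholder_ip = bytes([0, 0, 0, 1])
    return (
        header
        + placeholder_ip
        + user_id + b"\x00"
        + hostname.encode("ascii") + b"\x00"
    )


request = socks4a_connect_request("naughty.com", 80)
```

Since the name travels inside the SOCKS request, the lookup happens on the far side of the proxy and the local resolver never sees "naughty.com".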
New solution: Tor acts as a DNS server.
Dumb app, to Tor (DNS): “Where is naughty.com?”
Tor: “1.2.3.4!”
Dumb app, to Tor (SOCKS): “get me 1.2.3.4!”
This also lets dumb apps handle .onion addresses.
Problem: editing text files is hard.
So, add support for external GUIs.
[Diagram: Vidalia, TorK, .... each talk to Tor.]
Things to do:
● Tor: https://torproject.org
  – Try it out; want to run a server?
  – See docs and specs for more detail.
● Donate to Tor!
  – https://torproject.org/donate.html
  – (We’re a tax-deductible charity!)
● Donate to EFF too!
  – I’m in the dunk tank at 6:30.
● See more talks!
  – Roger at 2 on anti-censorship.
  – Mike at 5 on securing the network and apps.