Server and Threads vanilladb.org Where are we? VanillaCore JDBC - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Server and Threads

vanilladb.org

slide-2
SLIDE 2

[Figure: VanillaCore architecture, showing the components JDBC Interface (at client side), Remote.JDBC (client/server), Server, Query Interface, Planner, Parse, Algebra, Sql/Util, Storage Interface, Tx, Concurrency, Recovery, Metadata, Index, Record, Buffer, Log, and File]

Where are we?

2

slide-3
SLIDE 3

Before Diving into the Code…

  • How does this massive code run?
  • How many processes?
  • How many threads?
  • Thread-local or thread-safe components?
  • Any difference between embedded clients and remote clients?
  • These decisions may influence the software architecture of an RDBMS and its performance

3

slide-4
SLIDE 4

Outline

  • Processes, threads, and resource management

– Processes and threads – VanillaDB – Embedded clients – Remote clients

  • Implementing JDBC

– RMI – Remote Interfaces and client-side wrappers – Remote Implementations – StartUp

4

slide-5
SLIDE 5

Outline

  • Processes, threads, and resource management

– Processes and threads – VanillaDB – Embedded clients – Remote clients

  • Implementing JDBC

– RMI – Remote Interfaces and client-side wrappers – Remote Implementations – StartUp

5

slide-6
SLIDE 6

What’s the difference between a process and a thread?

6

slide-7
SLIDE 7

Process vs. Thread (1/2)

7

slide-8
SLIDE 8

Process vs. Thread (2/2)

  • Process = threads (at least one) + global resources (e.g., memory space/heap, files, etc.)
  • Thread = a unit of CPU execution + local resources (e.g., program counter, registers, stack, etc.)
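The split between global and local resources can be sketched in Java as follows. This is only an illustration: the class `SharedVsLocal` and its members are hypothetical names, not part of VanillaCore. The static counter lives on the heap and is shared by every thread, while the variable `local` lives on each thread's own stack.

```java
import java.util.concurrent.atomic.AtomicLong;

final class SharedVsLocal {
    // Global resource: one heap object shared by all threads of the process.
    static final AtomicLong shared = new AtomicLong();

    static long[] run(int threads, int increments) throws InterruptedException {
        long[] localTotals = new long[threads];
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                long local = 0;               // local resource: on this thread's stack
                for (int i = 0; i < increments; i++) {
                    local++;
                    shared.incrementAndGet(); // atomic update of the shared counter
                }
                localTotals[id] = local;      // each thread writes only its own slot
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();                         // after join(), the workers' writes are visible
        }
        return localTotals;
    }

    public static void main(String[] args) throws InterruptedException {
        long[] locals = run(4, 1000);
        System.out.println("shared = " + shared.get() + ", local[0] = " + locals[0]);
    }
}
```

The shared counter needs an atomic (or otherwise synchronized) update because several threads touch it; the stack variable needs no protection at all.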

8

slide-9
SLIDE 9

What’s the difference between a kernel thread and a user thread?

9

slide-10
SLIDE 10

Kernel Threads

  • Scheduled by OS

– On single-core machines:
– On multi-core machines:
– Examples: POSIX Pthreads (UNIX), Win32 threads

10

slide-11
SLIDE 11

User Threads

  • Scheduled by user applications (in user space above the kernel)

– Lightweight -> faster to create/destroy – Examples: POSIX Pthreads (UNIX), Java threads

  • Eventually mapped to kernel threads

– How?

11

slide-12
SLIDE 12

Many-to-One

  • Pros:

– Simple – Efficient thread mgr.

  • Cons:

– One blocking system call makes all threads halt – Cannot run across multiple CPUs (each kernel thread runs on one CPU)

  • Examples:

– Green threads in Solaris, seldom used in modern OS

12

slide-13
SLIDE 13

One-to-One

  • Pros:

– Avoid the blocking problem

  • Cons:

– Slower thread mgr.

  • Most OSs limit the number of kernel threads to be mapped for a process

  • Examples: Linux and Windows (from 95)

13

slide-14
SLIDE 14

Many-to-Many

  • Combining the best features of the one-to-one and many-to-one models
  • Allowing more kernel threads for a heavy user thread
  • Examples: IRIX, HP-UX, Tru64, and Solaris (prior to 9)

– Downgradable to one-to-one

14

slide-15
SLIDE 15

How about Java threads?

15

slide-16
SLIDE 16

Java Threads

  • Scheduled by JVM
  • Mapping depends on the JVM implementation

– But normally one-to-one mapped to Pthreads/Win32 threads on UNIX/Windows

  • Pros:

– System independent (if there’s a JVM)

16

slide-17
SLIDE 17

Outline

  • Processes, threads, and resource management

– Processes and threads – VanillaDB – Embedded clients – Remote clients

  • Implementing JDBC

– RMI – Remote Interfaces and client-side wrappers – Remote Implementations – StartUp

17

slide-18
SLIDE 18

Why does an RDBMS support concurrent statements/txs?

18

slide-19
SLIDE 19

Serialized or interleaved operations?

19

slide-20
SLIDE 20

Throughput via Pipelining

  • Interleaving ops increases throughput by pipelining CPU and I/O

20

[Figure: two timelines of Tx1 and Tx2 built from the operations R(A), CPU, W(A), and W(B), one serialized and one interleaved; gaps are marked as idle]
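The pipelining effect can be checked with a toy model. The sketch below is an assumption-laden simplification, not VanillaCore code: every operation takes one time unit, there is exactly one CPU and one disk, and in each time unit each of them can serve at most one transaction. All names (`PipelineSketch`, `Op`) are hypothetical.

```java
import java.util.List;

final class PipelineSketch {
    enum Op { IO, CPU }

    /** Time units needed when each tx runs to completion before the next one starts. */
    static int serialTime(List<List<Op>> txs) {
        int total = 0;
        for (List<Op> tx : txs) {
            total += tx.size();
        }
        return total;
    }

    /** Time units needed when the CPU and the disk may each serve one tx per time unit. */
    static int interleavedTime(List<List<Op>> txs) {
        int[] next = new int[txs.size()];      // index of the next op of each tx
        int remaining = serialTime(txs);
        int time = 0;
        while (remaining > 0) {
            boolean cpuBusy = false;
            boolean ioBusy = false;
            for (int i = 0; i < txs.size(); i++) {
                if (next[i] == txs.get(i).size()) {
                    continue;                  // this tx has finished
                }
                Op op = txs.get(i).get(next[i]);
                if (op == Op.CPU && !cpuBusy) {
                    cpuBusy = true;
                    next[i]++;
                    remaining--;
                } else if (op == Op.IO && !ioBusy) {
                    ioBusy = true;
                    next[i]++;
                    remaining--;
                }
                // otherwise the tx waits for its resource in this time unit
            }
            time++;
        }
        return time;
    }

    public static void main(String[] args) {
        // Each tx: read a block (I/O), compute (CPU), write a block (I/O).
        List<Op> tx = List.of(Op.IO, Op.CPU, Op.IO);
        List<List<Op>> txs = List.of(tx, tx);
        System.out.println("serial = " + serialTime(txs)
                + ", interleaved = " + interleavedTime(txs));
    }
}
```

Under these assumptions, two transactions of the form read, compute, write need 6 time units when serialized and 4 when interleaved, because one transaction uses the CPU while the other uses the disk.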

slide-21
SLIDE 21

Statements run by processes or threads?

21

slide-22
SLIDE 22

Processes vs. Threads

  • Don’t forget resources!

– Files

  • If statements are run by processes, then we need inter-process communication

– When, e.g., two statements access the same table (file)
– System dependent

  • Threads allow global resources to be shared directly

– E.g., through static variables

22

slide-23
SLIDE 23

What Resources Do We Have?

  • Opened files
  • Buffers (to cache pages)
  • Logs
  • Locks of objects (incl. files/blocks/record locks)
  • Metadata
  • How are they shared in VanillaCore?

23

slide-24
SLIDE 24

[Figure: VanillaCore architecture, showing the components JDBC Interface (at client side), Remote.JDBC (client/server), Server, Query Interface, Planner, Parse, Algebra, Sql/Util, Storage Interface, Tx, Concurrency, Recovery, Metadata, Index, Record, Buffer, Log, and File]

Architecture of VanillaCore

24

slide-25
SLIDE 25

VanillaDb (1/2)

  • Provides access to global resources:

– FileMgr, BufferMgr, LogMgr, CatalogMgr

  • Creates the new objects that access global resources:

– Planner and Transaction

25

VanillaDb
+ init(dirName : String)
+ init(dirName : String, bufferMgrType : BufferMgrType)
+ isInited() : boolean
+ initFileMgr(dirname : String)
+ initFileAndLogMgr(dirname : String)
+ initFileLogAndBufferMgr(dirname : String, bufferMgrType : BufferMgrType)
+ initTaskMgr()
+ initTxMgr()
+ initCatalogMgr(isnew : boolean, tx : Transaction)
+ initStatMgr(tx : Transaction)
+ initSPFactory()
+ initCheckpointingTask()
+ fileMgr() : FileMgr
+ bufferMgr() : BufferMgr
+ logMgr() : LogMgr
+ catalogMgr() : CatalogMgr
+ statMgr() : StatMgr
+ taskMgr() : TaskMgr
+ txMgr() : TransactionMgr
+ spFactory() : StoredProcedureFactory
+ newPlanner() : Planner
+ initAndStartProfiler()
+ stopProfilerAndReport()

slide-26
SLIDE 26

VanillaDb (2/2)

  • Before using VanillaCore, VanillaDb.init(name) must be called

– Initializes the file, log, buffer, metadata, and tx mgrs
– Creates or recovers the specified database
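The pattern of initializing once and then sharing managers through static accessors can be sketched as below. This is not the VanillaDb source: `MiniDb` and `FileMgrStub` are hypothetical stand-ins, and ignoring a repeated `init` call is an assumption made for the sketch.

```java
final class MiniDb {
    /** Hypothetical stand-in for a global manager such as FileMgr. */
    static final class FileMgrStub {
        final String dirName;

        FileMgrStub(String dirName) {
            this.dirName = dirName;
        }
    }

    // One instance for the whole process, reachable from every thread.
    private static FileMgrStub fileMgr;

    /** Must be called before any accessor; repeated calls are ignored (an assumption). */
    static synchronized void init(String dirName) {
        if (fileMgr != null) {
            return;
        }
        fileMgr = new FileMgrStub(dirName);
    }

    static synchronized boolean isInited() {
        return fileMgr != null;
    }

    static synchronized FileMgrStub fileMgr() {
        if (fileMgr == null) {
            throw new IllegalStateException("call init() first");
        }
        return fileMgr;
    }

    public static void main(String[] args) {
        init("sample-db");
        System.out.println("using directory " + fileMgr().dirName);
    }
}
```

Because the accessors are static, any worker thread can reach the same manager without passing references around, which is why the managers themselves must be thread-safe.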

26

slide-27
SLIDE 27

Outline

  • Processes, threads, and resource management

– Processes and threads – VanillaDB – Embedded clients – Remote clients

  • Implementing JDBC

– RMI – Remote Interfaces and client-side wrappers – Remote Implementations – StartUp

27

slide-28
SLIDE 28

Embedded Clients

  • Running on the same machine as RDBMS
  • Usually, single-threaded applications

– E.g., sensor nodes, dictionaries, phone apps, etc.

  • If you need high throughput, manage threads yourself

– Identify causal relationships between statements
– Run each group of causal statements in a thread
– No causal relationship between the results output by different groups
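A minimal sketch of running each causal group in its own thread is given below. The names `CausalGroups` and `runGroups` are hypothetical, and the `execute` function stands in for whatever call actually runs a statement against the embedded database.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class CausalGroups {
    /**
     * Runs every group in its own thread. Statements inside a group run in
     * order; nothing is promised about the ordering between groups.
     */
    static List<List<String>> runGroups(List<List<String>> groups,
                                        Function<String, String> execute)
            throws InterruptedException {
        List<List<String>> results = new ArrayList<>();
        List<Thread> threads = new ArrayList<>();
        for (List<String> group : groups) {
            List<String> out = new ArrayList<>();   // written by one worker only
            results.add(out);
            Thread t = new Thread(() -> {
                for (String stmt : group) {
                    out.add(execute.apply(stmt));   // causal order kept within the group
                }
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            t.join();                               // results are safe to read after join()
        }
        return results;
    }

    public static void main(String[] args) throws InterruptedException {
        List<List<String>> groups = List.of(
                List.of("INSERT a", "SELECT a"),
                List.of("SELECT b"));
        System.out.println(runGroups(groups, stmt -> "ran " + stmt));
    }
}
```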

28

slide-29
SLIDE 29

Outline

  • Processes, threads, and resource management

– Processes and threads – VanillaDB – Embedded clients – Remote clients

  • Implementing JDBC

– RMI – Remote Interfaces and client-side wrappers – Remote Implementations – StartUp

29

slide-30
SLIDE 30

Remote Clients

  • The server handles the worker thread creation
  • One worker thread per request
  • Clients can still create multiple client threads

– E.g., web/application servers

30

[Figure: client threads connect to the server/dispatcher thread, which hands requests to worker threads]

slide-31
SLIDE 31

What is a request?

  • An I/O operation?
  • A statement?
  • A transaction?
  • A connection?

31

slide-32
SLIDE 32

Request = Connection

  • In VanillaDB, a worker thread handles all statements issued by the same user
  • Rationale:

– Statements issued by a user are usually in a causal order -> ensure causality in a session
– A user may re-examine the data he/she has accessed -> easier caching

  • Implications:

– All statements issued in a JDBC connection are run by a single thread at the server
– #connections = #threads

32

slide-33
SLIDE 33

Thread Pooling

  • Creating/destroying a thread each time upon connection/disconnection leads to large overhead
  • To reduce this overhead, a worker thread pool is commonly used

– Threads are allocated from the pool as needed, and returned to the pool when no longer needed
– When no threads are available in the pool, the client may have to wait until one becomes available

  • So what?
  • Graceful performance degradation by limiting the pool size
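The effect of a bounded pool can be sketched with the standard `java.util.concurrent` executor. This is an illustration only: `WorkerPoolSketch` is a hypothetical name, the sleep stands in for real statement execution, and VanillaCore itself relies on RMI for its threading.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class WorkerPoolSketch {
    /** Submits one task per connection and reports the peak number running at once. */
    static int maxConcurrent(int poolSize, int connections) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        AtomicInteger active = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (int c = 0; c < connections; c++) {
            pool.execute(() -> {
                int now = active.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(20);                   // stand-in for running statements
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                active.decrementAndGet();               // worker goes back to the pool
            });
        }
        pool.shutdown();                                // accept no new tasks
        pool.awaitTermination(10, TimeUnit.SECONDS);    // wait for queued tasks to finish
        return peak.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("peak concurrent workers: " + maxConcurrent(3, 10));
    }
}
```

With a pool of 3 and 10 connections, no more than 3 tasks ever run at the same time; the remaining ones wait in the queue, which is the graceful degradation the slide refers to.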

33

slide-34
SLIDE 34

Outline

  • Processes, threads, and resource management

– Processes and threads – VanillaDB – Embedded clients – Remote clients

  • Implementing JDBC

– RMI – Remote Interfaces and client-side wrappers – Remote Implementations – StartUp

34

slide-35
SLIDE 35

JDBC Programming

  1. Connect to the server
  2. Execute the desired query
  3. Loop through the result set (for SELECT only)
  4. Close the connection

  • A result set ties up valuable resources on the server, such as buffers and locks
  • A client should close its connection as soon as the database is no longer needed
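The four steps map onto the standard `java.sql` API roughly as in the sketch below. The class `JdbcSteps` and its method names are hypothetical, and the driver, URL, query, and field name are supplied by the caller. The try-with-resources blocks close the result set, statement, and connection as soon as they are no longer needed.

```java
import java.sql.Connection;
import java.sql.Driver;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

final class JdbcSteps {
    /** Steps 2 and 3: execute the query and loop through the result set. */
    static List<String> fetchColumn(Connection conn, String qry, String fldName)
            throws SQLException {
        List<String> values = new ArrayList<>();
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(qry)) {
            while (rs.next()) {
                values.add(rs.getString(fldName));
            }
        } // rs and stmt are closed here, releasing server-side resources
        return values;
    }

    /** Steps 1 and 4: connect, run one query, and close the connection. */
    static List<String> queryOnce(Driver driver, String url, String qry, String fldName)
            throws SQLException {
        try (Connection conn = driver.connect(url, new Properties())) {
            if (conn == null) {
                throw new SQLException("driver does not accept URL: " + url);
            }
            return fetchColumn(conn, qry, fldName);
        } // conn is closed here
    }
}
```

A caller would pass a concrete `java.sql.Driver` implementation and a URL that driver understands to `queryOnce`.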

35

slide-36
SLIDE 36

java.sql (1/2)

  • Makes connections

to the server

36

<<interface>> Driver
+ connect(url : String, info : Properties) : Connection

<<interface>> Connection
+ createStatement() : Statement
+ close()
+ setAutoCommit(autoCommit : boolean)
+ setReadOnly(readOnly : boolean)
+ setTransactionIsolation(level : int)
+ getAutoCommit() : boolean
+ getTransactionIsolation() : int
+ commit()
+ rollback()

slide-37
SLIDE 37

java.sql (2/2)

37

  • An iterator of output records

<<interface>> Statement
+ executeQuery(qry : String) : ResultSet
+ executeUpdate(cmd : String) : int
...

<<interface>> ResultSet
+ next() : boolean
+ getInt(fldname : String) : int
+ getString(fldname : String) : String
+ getLong(fldname : String) : long
+ getDouble(fldname : String) : double
+ getMetaData() : ResultSetMetaData
+ beforeFirst()
+ close()
...

<<interface>> ResultSetMetaData
+ getColumnCount() : int
+ getColumnName(column : int) : String
+ getColumnType(column : int) : int
+ getColumnDisplaySize(column : int) : int
...

slide-38
SLIDE 38

Implementing JDBC in VanillaCore

  • The JDBC API is defined at the client side
  • Needs both client- and server-side implementations

– In the org.vanilladb.core.remote.jdbc package
– JdbcXxx are client-side classes
– RemoteXxx are server-side classes

  • Based on Java RMI

– Handles server threading: dispatcher thread, worker threads, and thread pool
– But gives no control over the pool size
– Synchronizes a client thread with a worker thread

  • Blocking method calls at clients

38

slide-39
SLIDE 39

[Figure: VanillaCore architecture, showing the components JDBC Interface (at client side), Remote.JDBC (client/server), Server, Query Interface, Planner, Parse, Algebra, Sql/Util, Storage Interface, Tx, Concurrency, Recovery, Metadata, Index, Record, Buffer, Log, and File]

Architecture of VanillaCore

39

slide-40
SLIDE 40

Outline

  • Processes, threads, and resource management

– Processes and threads – VanillaDB – Embedded clients – Remote clients

  • Implementing JDBC

– RMI – Remote Interfaces and client-side wrappers – Remote Implementations – StartUp

40

slide-41
SLIDE 41

Java RMI

  • Java RMI allows methods of an object at the server VM to be invoked remotely at a client VM

– We call this object a remote object

  • How?

41

slide-42
SLIDE 42

The Stub and Skeleton

1. The skeleton (run by a server thread) binds the interface of the remote object
2. A client thread looks up and obtains a stub of the skeleton
3. When a client thread invokes a method, it is blocked and the call is first forwarded to the stub
4. The stub marshals the parameters and sends the call to the skeleton through the network
5. The skeleton receives the call, unmarshals the parameters, and allocates from the pool a worker thread that runs the remote object’s method on behalf of the client
6. When the method returns, the worker thread returns the result to the skeleton and returns to the pool
7. The skeleton marshals the result and sends it to the stub
8. The stub unmarshals the result and resumes the client thread

42

[Figure: the RMI client calls through the stub to the skeleton at the RMI server, and the result is returned along the same path]

slide-43
SLIDE 43

RMI registry

  • The server must first bind the remote object’s interface to the registry with a name

– The interface must extend the java.rmi.Remote interface

  • The client looks up the name in the registry to obtain a stub

[Figure: on the server-side machine, the RMI server binds to the registry; on the client-side machine, the RMI client looks up the name to get a stub, calls the skeleton through the stub, and receives the return value]
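The bind and lookup steps can be tried inside a single JVM with the standard `java.rmi` classes. The sketch below is not VanillaCore code: the `Echo` interface, the name "echo-demo", and the use of a free local port instead of 1099 are all assumptions made so that the example runs on its own.

```java
import java.net.ServerSocket;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

final class RmiRoundTrip {
    /** Hypothetical remote interface; remote methods must declare RemoteException. */
    public interface Echo extends Remote {
        String echo(String msg) throws RemoteException;
    }

    static final class EchoImpl implements Echo {
        @Override
        public String echo(String msg) {
            return "echo: " + msg;
        }
    }

    static String roundTrip(String msg) throws Exception {
        // Make stubs point at the loopback address so the demo needs no network setup.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        int port;
        try (ServerSocket probe = new ServerSocket(0)) {
            port = probe.getLocalPort();               // pick a currently free port
        }

        // Server side: create the registry, export the object, bind its stub by name.
        Registry serverReg = LocateRegistry.createRegistry(port);
        EchoImpl impl = new EchoImpl();
        Echo stub = (Echo) UnicastRemoteObject.exportObject(impl, 0);
        serverReg.rebind("echo-demo", stub);
        try {
            // Client side: look up the name to obtain a stub, then call through it.
            Registry clientReg = LocateRegistry.getRegistry("127.0.0.1", port);
            Echo remote = (Echo) clientReg.lookup("echo-demo");
            return remote.echo(msg);                   // blocks until the server replies
        } finally {
            // Unexport so that the RMI threads stop and the JVM can exit.
            UnicastRemoteObject.unexportObject(impl, true);
            UnicastRemoteObject.unexportObject(serverReg, true);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("hello"));
    }
}
```

In a real deployment the server and the client run in different JVMs, and only the remote interface is shared between them.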

43

slide-44
SLIDE 44

Things to Note

  • A client thread and a worker thread are synchronized
  • The same remote object is run by multiple worker threads (one per client)

– Remote objects bound to the registry must be thread-safe

  • If the return value of a remote method is another remote object, the stub of that object is created automatically and sent back to the client

– That object can be either thread-local or thread-safe, depending on whether it is created or reused during each method call

  • A remote object will not be garbage collected if there’s a client holding its stub

– Destroy the stub (e.g., by closing the connection) at the client side ASAP
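The distinction between a shared, thread-safe object and a per-client, thread-local object can be sketched without RMI. All names below (`SharedDriverSketch`, `Driver`, `Session`) are hypothetical; the `Driver` plays the role of an object bound to the registry, and each `Session` plays the role of an object created anew during a method call.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class SharedDriverSketch {
    /** Created per call: only the calling worker thread ever touches it. */
    static final class Session {
        final int id;
        int statementsRun;                 // plain field is fine, no sharing

        Session(int id) {
            this.id = id;
        }

        void run() {
            statementsRun++;
        }
    }

    /** Shared by all worker threads: its state must be thread-safe. */
    static final class Driver {
        private final AtomicInteger nextId = new AtomicInteger();

        Session connect() {
            return new Session(nextId.incrementAndGet());
        }

        int connections() {
            return nextId.get();
        }
    }

    /** Lets several threads call connect() on one Driver and returns its final count. */
    static int simulate(int clients, int connectsPerClient) throws InterruptedException {
        Driver driver = new Driver();
        Thread[] workers = new Thread[clients];
        for (int i = 0; i < clients; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < connectsPerClient; j++) {
                    driver.connect().run();
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        return driver.connections();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("connections handed out: " + simulate(8, 1000));
    }
}
```

If `nextId` were a plain `int`, concurrent calls to `connect()` could lose updates; the per-call `Session` has no such problem because it is never shared.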

44

slide-45
SLIDE 45

Outline

  • Processes, threads, and resource management

– Processes and threads – VanillaDB – Embedded clients – Remote clients

  • Implementing JDBC

– RMI – Remote Interfaces and client-side wrappers – Remote Implementations – StartUp

45

slide-46
SLIDE 46

Server-Side JDBC Impl.

  • RemoteXxx classes mirror their corresponding JDBC interfaces at the client side

– Implement the most essential JDBC methods only

  • Interfaces: RemoteDriver, RemoteConnection, RemoteStatement, RemoteResultSet and RemoteMetaData

– To be bound to the registry
– Extend java.rmi.Remote
– Throw RemoteException instead of SQLException

46

slide-47
SLIDE 47

RemoteDriver

  • Corresponds to the JDBC Driver interface

47

<<interface>> RemoteDriver
+ connect() : RemoteConnection

RemoteDriverImpl
+ RemoteDriverImpl()
+ connect() : RemoteConnection

slide-48
SLIDE 48

RemoteConnection

  • Corresponds to JDBC Connection interface

48

<<interface>> RemoteConnection
+ createStatement() : RemoteStatement
+ close()
+ setAutoCommit(autoCommit : boolean)
+ setReadOnly(readOnly : boolean)
+ setTransactionIsolation(level : int)
+ getAutoCommit() : boolean
+ isReadOnly() : boolean
+ getTransactionIsolation() : int
+ commit()
+ rollback()

RemoteConnectionImpl
~ RemoteConnectionImpl()
+ createStatement() : RemoteStatement
+ close()
+ setAutoCommit(autoCommit : boolean)
+ setReadOnly(readOnly : boolean)
+ setTransactionIsolation(level : int)
+ getAutoCommit() : boolean
+ isReadOnly() : boolean
+ getTransactionIsolation() : int
+ commit()
+ rollback()
~ getTransaction() : Transaction
~ endStatement()

slide-49
SLIDE 49

RemoteStatement

  • Corresponds to JDBC Statement interface

49

<<interface>> RemoteStatement
+ executeQuery(qry : String) : RemoteResultSet
+ executeUpdate(cmd : String) : int

RemoteStatementImpl
+ RemoteStatementImpl(rconn : RemoteConnectionImpl)
+ executeQuery(qry : String) : RemoteResultSet
+ executeUpdate(cmd : String) : int

slide-50
SLIDE 50

RemoteResultSet

  • Corresponds to JDBC ResultSet interface

50

RemoteResultSetImpl
+ RemoteResultSetImpl(plan : Plan, rconn : RemoteConnectionImpl)
+ next() : boolean
+ getInt(fldname : String) : int
+ getLong(fldname : String) : long
+ getDouble(fldname : String) : double
+ getString(fldname : String) : String
+ getMetaData() : RemoteMetaData
+ beforeFirst()
+ close()

<<interface>> RemoteResultSet
+ next() : boolean
+ getInt(fldname : String) : int
+ getLong(fldname : String) : long
+ getDouble(fldname : String) : double
+ getString(fldname : String) : String
+ getMetaData() : RemoteMetaData
+ beforeFirst()
+ close()

slide-51
SLIDE 51

RemoteMetaData

  • Corresponds to JDBC ResultSetMetaData interface

51

<<interface>> RemoteMetaData
+ getColumnCount() : int
+ getColumnName(column : int) : String
+ getColumnType(column : int) : int
+ getColumnDisplaySize(column : int) : int

RemoteMetaDataImpl
+ RemoteMetaDataImpl(sch : Schema)
+ getColumnCount() : int
+ getColumnName(column : int) : String
+ getColumnType(column : int) : int
+ getColumnDisplaySize(column : int) : int

slide-52
SLIDE 52

Registering Remote Objects

  • Only the RemoteDriver needs to be bound to the registry

– Stubs of the others can be obtained through method returns

  • Done by JdbcStartUp:

    /* create a registry specific for the server on the default port 1099 */
    Registry reg = LocateRegistry.createRegistry(1099);
    // post the server entry in it
    RemoteDriver d = new RemoteDriverImpl();
    /* create a stub for the remote implementation object d,
       save it in the RMI registry */
    reg.rebind("vanilladb-jdbc", d);

52

slide-53
SLIDE 53

Obtaining Stubs

  • To obtain the stubs at the client side:
  • Directly through the registry or indirectly through method returns

53

    // url = "jdbc:vanilladb://xxx.xxx.xxx.xxx" (no port given; the default registry port 1099 is used)
    String host = url.replace("jdbc:vanilladb://", "");
    Registry reg = LocateRegistry.getRegistry(host);
    RemoteDriver rdvr = (RemoteDriver) reg.lookup("vanilladb-jdbc");
    // creates connection
    RemoteConnection rconn = rdvr.connect();
    // creates statement
    RemoteStatement rstmt = rconn.createStatement();

slide-54
SLIDE 54

JDBC Client-Side Impl.

  • Implement java.sql interfaces using the client-side wrappers of stubs

– E.g., JdbcDriver wraps the stub of RemoteDriver

54

<<interface>> java.sql.Driver
+ connect(url : String, info : Properties) : Connection
+ acceptsURL(url : String) : boolean
+ getMajorVersion() : int
+ getMinorVersion() : int
+ getPropertyInfo(url : String, info : Properties) : DriverPropertyInfo[]
+ jdbcCompliant() : boolean

<<abstract>> DriverAdapter
// throws exceptions for unimplemented methods

JdbcDriver
+ connect(url : String, prop : Properties) : Connection
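The wrapper idea can be sketched on a single method. The names `RemoteStmt` and `StmtWrapper` are hypothetical and only shaped like RemoteStatement and its client-side counterpart; a real wrapper would implement the full `java.sql.Statement` interface, which is omitted here for brevity.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.sql.SQLException;

final class WrapperSketch {
    /** Hypothetical server-facing interface, shaped like RemoteStatement. */
    interface RemoteStmt extends Remote {
        int executeUpdate(String cmd) throws RemoteException;
    }

    /** Client-side wrapper: forwards to the stub and reports failures as SQLException. */
    static final class StmtWrapper {
        private final RemoteStmt stub;

        StmtWrapper(RemoteStmt stub) {
            this.stub = stub;
        }

        int executeUpdate(String cmd) throws SQLException {
            try {
                return stub.executeUpdate(cmd);    // remote call through the stub
            } catch (RemoteException e) {
                throw new SQLException(e);         // translate to the JDBC exception type
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        StmtWrapper stmt = new StmtWrapper(cmd -> 1);  // fake stub for illustration
        System.out.println("rows affected: " + stmt.executeUpdate("UPDATE ..."));
    }
}
```

The translation matters because JDBC callers expect `SQLException`, while RMI interfaces throw `RemoteException`.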

slide-55
SLIDE 55

DriverAdapter and JdbcDriver

55

  • DriverAdapter
  • Dummy impl. of the Driver interface (by throwing exceptions)
  • JdbcDriver:

    import java.rmi.registry.LocateRegistry;
    import java.rmi.registry.Registry;
    import java.sql.Connection;
    import java.sql.SQLException;
    import java.util.Properties;

    public class JdbcDriver extends DriverAdapter {
        public Connection connect(String url, Properties prop) throws SQLException {
            try {
                // assumes no port specified
                String host = url.replace("jdbc:vanilladb://", "");
                Registry reg = LocateRegistry.getRegistry(host);
                RemoteDriver rdvr = (RemoteDriver) reg.lookup("vanilladb-jdbc");
                RemoteConnection rconn = rdvr.connect();
                // wrap the stub in the client-side java.sql.Connection implementation
                return new JdbcConnection(rconn);
            } catch (Exception e) {
                throw new SQLException(e);
            }
        }
    }

slide-56
SLIDE 56

Outline

  • Processes, threads, and resource management

– Processes and threads – VanillaDB – Embedded clients – Remote clients

  • Implementing JDBC

– RMI – Remote Interfaces and client-side wrappers – Remote Implementations – StartUp

56

slide-57
SLIDE 57

Remote Class Implementation in RMI Layers

TCP

Remote Reference Layer

Transport Layer Java Virtual Machine Client Object

Remote Reference Layer

Transport Layer Java Virtual Machine Stub Remote Object Skeleton

57

slide-58
SLIDE 58

RemoteDriverImpl

  • RemoteDriverImpl is the entry point into the server
  • Each time its connect method is called (via the stub), it creates a new RemoteConnectionImpl on the server

– RMI creates the corresponding stub and returns it to the client

  • Run by multiple threads, so it must be thread-safe

58

slide-59
SLIDE 59

RemoteConnectionImpl

  • Manages client connections on the server

– Associated with a tx
– commit() commits the current tx and starts a new one immediately

  • Thread local

59


slide-60
SLIDE 60

RemoteStatementImpl

  • Executes SQL statements

– Creates a planner that finds the best plan tree

  • If the connection is set to auto commit, the executeUpdate() method calls connection.commit() at the end

  • Thread local

60
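The interplay between auto commit and commit() can be sketched as follows. This is not the RemoteStatementImpl source: `Conn` and `Stmt` are hypothetical classes, the tx is reduced to a counter, and the update itself is a placeholder.

```java
final class AutoCommitSketch {
    /** Hypothetical connection: always has a current tx, identified by a number. */
    static final class Conn {
        boolean autoCommit = true;
        int commits;          // number of txs committed so far
        int currentTx = 1;    // id of the tx in progress

        void commit() {
            commits++;        // commit the current tx ...
            currentTx++;      // ... and start a new one immediately
        }
    }

    /** Hypothetical statement bound to one connection. */
    static final class Stmt {
        private final Conn conn;

        Stmt(Conn conn) {
            this.conn = conn;
        }

        int executeUpdate(String cmd) {
            int affected = 1;             // placeholder for planning and running cmd
            if (conn.autoCommit) {
                conn.commit();            // auto commit: finish the tx at the end
            }
            return affected;
        }
    }

    public static void main(String[] args) {
        Conn conn = new Conn();
        new Stmt(conn).executeUpdate("UPDATE ...");
        System.out.println("commits = " + conn.commits + ", current tx = " + conn.currentTx);
    }
}
```

With auto commit switched off, several updates share one tx until the client calls commit() itself.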

slide-61
SLIDE 61

RemoteResultSetImpl

  • Provides methods for iterating the output records

– The scan opened from the best plan tree

  • Tx spans through the iteration

– Avoid doing heavy jobs during the iteration

  • Thread local

61

slide-62
SLIDE 62

RemoteMetaDataImpl

  • Provides the schema information about the query

results

– Contains the Schema object of the output table

  • Thread local

62

slide-63
SLIDE 63

Outline

  • Processes, threads, and resource management

– Processes and threads – VanillaDB – Embedded clients – Remote clients

  • Implementing JDBC

– RMI – Remote Interfaces and client-side wrappers – Remote Implementations – StartUp

63

slide-64
SLIDE 64

Starting Up

  • StartUp provides main() that runs VanillaCore as a JDBC server

– Calls VanillaDb.init()

  • Sharing global resources through static variables

– Binds RemoteDriver to the RMI registry

  • Thread per connection
  • Generally

– Classes in the query engine are thread-local
– Classes in the storage engine are thread-safe

64

slide-65
SLIDE 65

Assignment Reading

  • The following packages in VanillaCore

– org.vanilladb.core.server – org.vanilladb.core.remote.jdbc

65

slide-66