meant for MCA II YEAR STUDENTS TELANGANA UNIVERSITY NIZAMABAD
Socket API
Socket API originated with the 4.2 BSD system released in 1983
Sockets – A way to speak to other programs using UNIX file descriptors.
A file descriptor is an integer associated with an open file. The open file can also be a network connection.
Kinds of sockets: DARPA Internet addresses (Internet sockets), Unix sockets, X.25 sockets, etc.
Types of Internet Sockets
SOCK_STREAM uses TCP (Transmission Control Protocol): connection-oriented and reliable
SOCK_DGRAM uses UDP (User Datagram Protocol): connectionless and unreliable
Structs and Data Handling
A socket descriptor is of type int
Byte ordering
Most significant byte first -- Network byte order (big endian)
Host byte order -- Whatever order the local CPU uses; on many machines (x86, for example) this is least significant byte first (little endian)
Socket Structures in Network byte order
struct sockaddr {
    unsigned short sa_family;      // address family, AF_xxx
    char sa_data[14];              // 14 bytes of protocol address
};
struct sockaddr_in {
    short int sin_family;          // Address family
    unsigned short int sin_port;   // Port number
    struct in_addr sin_addr;       // Internet address
    unsigned char sin_zero[8];     // Same size as struct sockaddr
};
Convert the Natives
struct in_addr {
    unsigned long s_addr;          // 32-bit long, or 4 bytes
};
If ina is of type struct sockaddr_in
ina.sin_addr.s_addr references the 4-byte IP address (in Network Byte Order)
htons() – Host to Network Short
htonl() -- "Host to Network Long"
ntohs() -- "Network to Host Short"
ntohl() -- "Network to Host Long"
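The effect of these conversions can be sketched in a few lines of C. The helper names `port_wire_bytes` and `addr_round_trip` are made up for this illustration; only `htons()`, `htonl()` and `ntohl()` come from the sockets API.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: convert a port to network byte order and copy
 * out the two bytes exactly as they sit in memory. */
void port_wire_bytes(uint16_t host_port, unsigned char out[2])
{
    uint16_t net_port = htons(host_port);   /* host -> network order */
    memcpy(out, &net_port, sizeof net_port);
}

/* Hypothetical helper: converting to network order and back again
 * gives the original value on any machine. */
uint32_t addr_round_trip(uint32_t host_addr)
{
    return ntohl(htonl(host_addr));
}
```

Whatever the host's byte order, `port_wire_bytes(0x1234, b)` leaves the most significant byte (0x12) first in `b`, which is what goes on the wire.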
IP Addresses
socket01.utdallas.edu 129.110.43.11
sol2.utdallas.edu 129.110.34.2 etc
Other UTD machines for use socket02 – socket06 , sol1 , jupiter
Please do not use apache for Network programming
inet_addr() converts an IP address in numbers-and-dots notation into unsigned long
ina.sin_addr.s_addr = inet_addr("129.110.43.11"); // Network byte order
Also can use inet_aton() -- "ascii to network"
int inet_aton(const char *cp,struct in_addr *inp);
inet_ntoa returns a string from a struct of type in_addr
inet_ntoa(ina.sin_addr) ;
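Putting the structures and conversion calls together, a minimal sketch of filling in a struct sockaddr_in might look like this. The function name `make_ipv4_addr` is hypothetical; it uses inet_aton() rather than inet_addr() because inet_aton() reports invalid input through its return value.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: fill *out with an IPv4 address and port.
 * Returns 0 on success, -1 if ip is not a valid dotted address. */
int make_ipv4_addr(const char *ip, unsigned short port,
                   struct sockaddr_in *out)
{
    memset(out, 0, sizeof *out);             /* also zeroes sin_zero */
    out->sin_family = AF_INET;
    out->sin_port = htons(port);             /* network byte order */
    if (inet_aton(ip, &out->sin_addr) == 0)  /* 0 means invalid address */
        return -1;
    return 0;
}
```

Passing the result of inet_ntoa() on `out->sin_addr` to printf() would print the dotted form again.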
Useful UNIX Commands
netstat -i prints information about the interfaces
netstat -ni prints this information using numeric addresses
The loopback interface is called lo and the Ethernet interface is called eth0 or le0 depending on the machine
netstat -r prints the routing table
netstat | grep PORT_NO shows the state of the client socket
ifconfig eth0 -- Given an interface name, ifconfig prints the details of that interface: Ethernet address, inet addr, Bcast, Mask, MTU
ping IP_addr -- Sends a packet to the host specified by IP_addr and prints out the roundtrip time ( Uses ICMP messages)
traceroute IP_addr -- Shows the path from this host to the destination printing out the roundtrip time for a packet to each hop in between
tcpdump -- Communicates directly with the data link layer to capture packets, for example to check whether a UDP packet failed to arrive
System Calls
socket() – returns a socket descriptor
int socket(int domain, int type, int protocol);
bind() – What port I am on / what port to attach to
int bind(int sockfd, const struct sockaddr *my_addr, socklen_t addrlen);
connect() – Connect to a remote host
int connect(int sockfd, const struct sockaddr *serv_addr, socklen_t addrlen);
listen() – Waiting for someone to connect to my port
int listen(int sockfd, int backlog);
accept() – Get a file descriptor for an incoming connection
int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen);
send() and recv() – Send and receive data over a connection
ssize_t send(int sockfd, const void *msg, size_t len, int flags);
ssize_t recv(int sockfd, void *buf, size_t len, int flags);
sendto() and recvfrom() – Send and receive data without connection
ssize_t sendto(int sockfd, const void *msg, size_t len, int flags, const struct sockaddr *to, socklen_t tolen);
ssize_t recvfrom(int sockfd, void *buf, size_t len, int flags, struct sockaddr *from, socklen_t *fromlen);
close() and shutdown() – Close a connection in both directions (close) or in one direction only (shutdown)
getpeername() – Obtain the peer name given the socket file descriptor
gethostname() – My computer name
int sock_get_port(const struct sockaddr *sockaddr,socklen_t addrlen);
Useful to get the port number given a struct of type sockaddr
readn(), writen(), readline() – Helper functions (not system calls) that read or write a particular number of bytes, or read one line
fork() – Start a new process with a copy of the parent's address space
exec() – Load a new program into the caller's address space
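On the server side, socket(), bind() and listen() are normally called in that order. The following is a minimal sketch, not a complete server: the name `make_listener` is hypothetical, and the SO_REUSEADDR option is an addition that the list above does not mention.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP socket listening on the given port
 * on every local interface. Returns the descriptor, or -1 on error. */
int make_listener(unsigned short port, int backlog)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    int yes = 1;   /* assumption: allow quick restarts on the same port */
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes);

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_ANY);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, backlog) < 0) {
        close(fd);
        return -1;
    }
    return fd;   /* ready to be passed to accept() */
}
```

A server would then call accept() on the returned descriptor in a loop; each call yields a new descriptor for one client.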
Issues in Client Programming
Identifying the Server.
Looking up an IP address.
Looking up a well known port name.
Specifying a local IP address.
UDP client design.
TCP client design.
Identifying the Server
Options:
hard-coded into the client program.
require that the user identify the server.
read from a configuration file.
use a separate protocol/network service to lookup the identity of the server.
Identifying a TCP/IP server.
Need an IP address, protocol and port.
We often use host names instead of IP addresses.
usually the protocol (UDP vs. TCP) is not specified by the user.
often the port is not specified by the user.
Services and Ports
Many services are available via “well known” addresses (names).
There is a mapping of service names to port numbers:
struct servent *getservbyname(const char *service, const char *protocol);
servent->s_port is the port number in network byte order.
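A minimal sketch of a lookup follows. The wrapper name `lookup_port` is hypothetical, and the result depends on the local services database (typically /etc/services), so a given name may not be found on every machine.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return a service's port in host byte order,
 * or -1 if the services database has no such entry. */
int lookup_port(const char *service, const char *protocol)
{
    struct servent *sp = getservbyname(service, protocol);
    if (sp == NULL)
        return -1;
    /* s_port holds a 16-bit port in network byte order */
    return ntohs((uint16_t)sp->s_port);
}
```

When the port is copied straight into sin_port, no conversion is needed, because s_port is already in network byte order.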
Specifying a Local Address
When a client creates and binds a socket it must specify a local port and IP address.
Typically a client doesn’t care what port it is on:
haddr->sin_port = htons(0);
Local IP address
A client can also ask the operating system to take care of specifying the local IP address:
haddr->sin_addr.s_addr = htonl(INADDR_ANY);
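The two settings can be combined in one small routine. This is a sketch only: `bind_any_local` is a hypothetical name, and the getsockname() call is added here just to show which port the operating system picked.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a socket of the given type (SOCK_STREAM
 * or SOCK_DGRAM) and let the OS choose the local port and address.
 * Stores the chosen port in *port_out; returns the socket or -1. */
int bind_any_local(int type, unsigned short *port_out)
{
    int fd = socket(AF_INET, type, 0);
    if (fd < 0)
        return -1;

    struct sockaddr_in local;
    memset(&local, 0, sizeof local);
    local.sin_family = AF_INET;
    local.sin_port = htons(0);                  /* any free port */
    local.sin_addr.s_addr = htonl(INADDR_ANY);  /* any local address */

    socklen_t len = sizeof local;
    if (bind(fd, (struct sockaddr *)&local, sizeof local) < 0 ||
        getsockname(fd, (struct sockaddr *)&local, &len) < 0) {
        close(fd);
        return -1;
    }
    *port_out = ntohs(local.sin_port);          /* port the OS assigned */
    return fd;
}
```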
UDP Client Design
Establish server address (IP and port).
Allocate a socket.
Specify that any valid local port and IP address can be used.
Communicate with server (sendto, recvfrom)
Close the socket.
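The steps above can be sketched as two small routines. The names `udp_send_request` and `udp_read_reply` are hypothetical. The sketch skips an explicit bind(): on an unbound UDP socket the kernel chooses a local port and address at the first sendto(), which has the same effect as asking for any port and any address.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: allocate a UDP socket and send one datagram to
 * the server. Returns the socket (for reading the reply), or -1. */
int udp_send_request(const struct sockaddr_in *server,
                     const void *msg, size_t len)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    if (sendto(fd, msg, len, 0, (const struct sockaddr *)server,
               sizeof *server) != (ssize_t)len) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: wait for one reply datagram.
 * Returns its length, or -1 on error. */
ssize_t udp_read_reply(int fd, void *buf, size_t cap)
{
    return recvfrom(fd, buf, cap, 0, NULL, NULL);
}
```

Because UDP is unreliable, a real client would normally add a timeout and a retry around the reply; the caller closes the socket with close() when done.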
Connected mode UDP
A UDP client can call connect() to establish the address of the server.
The UDP client can then use read() and write() or send() and recv().
A UDP client using a connected mode socket can only talk to one server (using the connected-mode socket).
TCP Client Design
Establish server address (IP and port).
Allocate a socket.
Specify that any valid local port and IP address can be used.
Call connect()
Communicate with server (read,write).
Close the connection.
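A minimal sketch of the first four steps is shown below. The name `tcp_connect` is hypothetical. As with UDP, the explicit bind() is left out: connect() on an unbound socket lets the kernel pick the local port and address.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: connect to a server given its dotted IP address
 * and port. Returns a connected socket, or -1 on error. */
int tcp_connect(const char *ip, unsigned short port)
{
    struct sockaddr_in server;
    memset(&server, 0, sizeof server);
    server.sin_family = AF_INET;
    server.sin_port = htons(port);
    if (inet_aton(ip, &server.sin_addr) == 0)   /* invalid address text */
        return -1;

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (connect(fd, (struct sockaddr *)&server, sizeof server) < 0) {
        close(fd);
        return -1;
    }
    return fd;   /* use read()/write(), then close() */
}
```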
Closing a TCP socket
Many TCP based application protocols support multiple requests and/or variable length requests over a single TCP connection.
How does the server know when the client is done (and it is OK to close the socket)?
Partial Close
One solution is for the client to shut down only its writing end of the socket.
The shutdown() system call provides this function.
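int shutdown(int s, int direction);
direction can be 0 to close the reading end, 1 to close the writing end, or 2 to close both.
shutdown sends info to the other process: after the writing end is closed, the peer's read() returns 0 (end of file) once it has read all the data sent so far.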
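A partial close can be sketched as follows. Both function names are hypothetical: `finish_sending` is what the client would call, and `read_until_eof` shows how the server notices that the client is done.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical client-side helper: announce that nothing more will be
 * sent, while keeping the reading end open for the server's reply. */
int finish_sending(int fd)
{
    return shutdown(fd, SHUT_WR);   /* symbolic name for "writing end" */
}

/* Hypothetical server-side helper: consume data until the peer shuts
 * down its writing end. Returns the byte count, or -1 on error. */
long read_until_eof(int fd)
{
    char buf[512];
    long total = 0;
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += n;
    return n < 0 ? -1 : total;      /* n == 0 means end of file */
}
```

After `read_until_eof()` returns, the server can still write its reply on the same socket, because only one direction has been closed.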
TCP sockets programming
Common problem areas:
null termination of strings.
reads don’t correspond to writes.
synchronization (including close()).
ambiguous protocol.
TCP Reads
Each call to read() on a TCP socket returns any available data (up to a maximum).
TCP buffers data at both ends of the connection.
You must be prepared to accept data 1 byte at a time from a TCP socket!
Server Design
Concurrent vs. Iterative
An iterative server handles a single client request at one time.
A concurrent server can handle multiple client requests at one time.
Connectionless vs. Connection-Oriented
Statelessness
State: Information that a server maintains about the status of ongoing client interactions.
Connectionless servers that keep state information must be designed carefully!
The Dangers of Statefulness
Clients can go down at any time.
Client hosts can reboot many times.
The network can lose messages.
The network can duplicate messages.
Concurrent Server Design Alternatives
One child per client
Spawn one thread per client
Preforking multiple processes
Prethreaded Server
One child per client
Traditional Unix server:
TCP: after call to accept(), call fork().
UDP: after recvfrom(), call fork().
Each process needs only a few sockets.
Small requests can be serviced in a small amount of time.
Parent process needs to clean up after its children (call wait()), otherwise finished children remain as zombie processes.
One thread per client
Almost like using fork() - just call pthread_create instead.
Using threads makes it easier (less overhead) for the per-client workers to share information, because threads share one address space.
Sharing information must be done carefully (protect shared data with a pthread mutex)
Prefork()’d Server
Creating a new process for each client is expensive.
We can create a bunch of processes, each of which can take care of a client.
Each child process is an iterative server.
Prefork()’d TCP Server
Initial process creates socket and binds to well known address.
Process now calls fork() a bunch of times.
All children call accept().
The next incoming connection will be handed to one child.
Preforking
As the book shows, having too many preforked children can be bad.
Using dynamic process allocation instead of a hard-coded number of children can avoid problems.
The parent process just manages the children, doesn’t worry about clients.
Sockets library vs. system call
A preforked TCP server won't usually work the way we want if sockets are not part of the kernel:
calling accept() is a library call, not an atomic operation.
We can get around this by making sure only one child calls accept() at a time using some locking scheme.
Prethreaded Server
Same benefits as preforking.
Can also have the main thread do all the calls to accept() and hand off each client to an existing thread.
What’s the best server design for my application?
Many factors:
expected number of simultaneous clients.
Transaction size (time to compute or lookup the answer)
Variability in transaction size.
Available system resources (perhaps what resources can be required in order to run the service).
Server Design
It is important to understand the issues and options.
Knowledge of queuing theory can be a big help.
You might need to test a few alternatives to determine the best design.