Web by Proxy

Using an HTTP Proxy to bridge legacy embedded systems to the Internet.

Adding a TCP/IP and web server stack to an embedded system is an expensive proposition. If your product already communicates, it may be better to use a proxy.

This article originally appeared in the May 2000 issue (http://www.embedded.com/internet/0005/0005ia1.htm) of Embedded Systems Programming Magazine (http://www.embedded.com).

It goes without saying that Internet-enabled devices are all the rage these days. A few short years ago, the only mainstream embedded users of the Internet were set-top boxes and network infrastructure equipment. Today, on the other hand, everybody wants to interact with every gadget they own via a web browser, and most of them can provide rational reasons for doing so.

But to most embedded devices, the Internet doesn't come easy. What do you do when your company's main product doesn't have a network port? How can critical applications like industrial controllers be placed online, without disrupting their primary functions? And what about the hordes of existing devices that don't have the resources to support TCP/IP and other Internet protocols? In some of these situations, an HTTP proxy may come to the rescue.

1. Copyright

This article is Copyright ©2002 by Bill Gatliff. All rights reserved. Reproduction for personal use is encouraged as long as the document is reproduced in its entirety, including this copyright notice and author contact information. For other uses, contact the author.

2. About the Author

Bill Gatliff is a freelance embedded developer and training consultant with almost ten years of experience using GNU and other tools for building embedded systems. His product background includes automotive, industrial, aerospace and medical instrumentation applications.

Bill specializes in GNU-based embedded development, and in using and adapting GNU tools to meet the needs of difficult development problems. He welcomes the opportunity to participate in projects of all types.

Bill is a Contributing Editor for Embedded Systems Programming Magazine, a member of the Advisory Panel for the Embedded Systems Conference, maintainer of the Crossgcc FAQ, creator of the gdbstubs project, and a noted author and speaker.

Bill welcomes feedback and suggestions. Contact information is on his website, at www.billgatliff.com.

3. What is a proxy?

Simply put, in a networking context a proxy is any program that provides a communications bridge that other applications can use to exchange data. Proxies are widely used to help protect applications from each other, as in the case of a network firewall. Our situation, however, illustrates another popular use for proxies: as translators between applications with seemingly incompatible communications strategies. Such a proxy can bring the Internet to an embedded system, while allowing the embedded target to speak its native tongue.

Proxy implementations come in a variety of shapes and sizes, which makes them difficult to present comprehensively in a single article. The fundamental concepts are the same in almost all cases, however, so even the relatively limited treatment I provide here will be useful in a more general setting.

For the remainder of this article, I will assume that the motivation for web-enabling a legacy embedded device is to allow a customer to interact with the product using an ordinary web browser. This assumption allows us to focus on a single kind of proxy, one that can translate between HTTP and the target device's own, proprietary protocol.

4. What is an HTTP proxy?

An HTTP proxy, as I present it here, is a program that implements a browser's HTTP requests for data using one or more proprietary message exchanges with the target embedded device. Once this exchange is complete, the proxy returns the result to the client as an HTML document, or some other kind of browser-friendly format like PNG, JPG, or even raw ASCII text.

The proxy executable is placed at the most convenient point between the client and the target, depending on the desired capabilities of the overall solution. In most cases, the best location for the proxy is on the PC running the browser, especially when the target doesn't support Ethernet, or when access is needed only while the client is standing next to the product. When something resembling true Internet-wide connectivity is necessary, however, the proxy can be installed on an inexpensive, single-board computer located between the target and the target's link to the network.

Note: A picture of this would be useful here, wouldn't it?

5. Why a proxy?

The traditional approach to putting a device "on the Internet" is to add TCP/IP and various other capabilities to the target itself. While this approach has its advantages, it is usually an unreasonable option for mature embedded products, particularly those that lack the necessary hardware interfaces, spare memory, or processor cycles.

Proxies enable Internet-style communications with legacy hardware without modification of the target application (of course, the target must support some kind of communications capability beforehand). The proxy application runs on a computer located somewhere between the client's browser and the target, and uses the target's native tongue to extract information to send back to the web browser. As a result, the target device has no idea that it has been Internet-enabled.

A proxy-based solution is more flexible than an embedded system that speaks IP directly. Because it doesn't need to physically coexist with the target application, a proxy can support the overhead necessary to present a uniform user interface for multiple target versions. In addition, the target device's visual interface, as shown on the client's browser, can be changed without taking the target system out of service simply by upgrading the proxy application.

A proxy also permits communication with targets that don't offer a connection medium normally associated with IP protocols. For example, an HTTP-to-CAN proxy could be used to provide browser access to a target that had only a CAN port.

6. HTTP 101

Obviously, an understanding of how web browsers communicate is needed before we can use a browser to interact with an embedded target via its HTTP proxy. I'm not going to try to train you for a new career as a web server designer in this section, but I will try to cover all the basics.

Contrary to popular belief, your web browser's primary language is actually HTTP, not HTML. When you type in a URL like http://www.embedded.com/index.html, for example, your browser sends an HTTP message to the web server on the machine named www.embedded.com. Stripped of the protocol version and header lines that normally accompany it, the message looks like this:

GET /index.html

A typical web server's response to this message is to return the contents of an HTML file named index.html, but this isn't always the case. In fact, the particulars of the response are left entirely to the server, and sophisticated ones like Zope (http://www.zope.org/) routinely break the conventional notion of a one-to-one mapping between URLs and file names on the serving machine (Zope has good reasons for doing so).

Moving on, when you fill in some text and then click on a button in an HTML form (the home page for an Internet search engine, for example), your browser sends a slightly different HTTP message to the server:

GET /query?textfield=textdata&pressme=press_here

This message tells the server that you typed the word textdata into a field called textfield, and then clicked the button pressme (which was showing the text press_here at the time) in an HTML form whose action is query. As with the previous message, what happens next is entirely up to the server. Often the result is that the web server passes the message to a standalone application that performs a host-specific function (a database lookup, for example), and then returns HTML to the browser.
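To make the name=value structure of such a message concrete, here is a minimal sketch of a helper that pulls one field out of the requested object. The function name query_value() and its signature are my own assumptions for illustration; nothing like it appears in the example proxy, and it makes no attempt to decode %xx escapes or '+' characters.

```c
#include <string.h>

/* Hypothetical helper (not part of the article's proxy): copy the value
   of the field called `name` out of a request object such as
   "/query?textfield=textdata&pressme=press_here".
   Returns 1 if the field was found and fit in `out`, 0 otherwise.
   URL-decoding is deliberately left out. */
int query_value(const char *object, const char *name,
                char *out, size_t outlen)
{
    const char *p = strchr(object, '?');
    size_t namelen = strlen(name);

    if (p == NULL || outlen == 0)
        return 0;
    p++;                                /* skip the '?' */

    while (*p != '\0') {
        /* each field runs up to the next '&' or the end of the string */
        size_t fieldlen = strcspn(p, "&");

        if (fieldlen > namelen
            && strncmp(p, name, namelen) == 0
            && p[namelen] == '=') {
            size_t vallen = fieldlen - namelen - 1;

            if (vallen >= outlen)
                return 0;               /* value would not fit */
            memcpy(out, p + namelen + 1, vallen);
            out[vallen] = '\0';
            return 1;
        }

        p += fieldlen;
        if (*p == '&')
            p++;                        /* move on to the next field */
    }
    return 0;
}
```

Given the message above, asking for pressme would yield press_here, and asking for a field that isn't present would return 0 so the caller can fall back to a default page.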

The HTTP protocol contains several other messages, including ones for PUTting and POSTing data. We don't need to consider those for our simple proxy, however, so in the interest of space I'll include references at the end of this article for further reading.

7. A basic example

With a proxy-based solution, the key to connecting an embedded device to a browser lies in the ability to translate between HTTP and whatever language and media the target system supports. To illustrate one way to do this, I have developed a very basic HTTP proxy. To use this code, you must enhance the included parse_http_request() function to decode an HTTP message in a manner most suited to your needs, and then use the information the message contains to decide what to do next. The code is shown in Figure 1.

Figure 1. A simple "home page" for your product.

int parse_http_request ( int connfd,
                         char *http_request_buf )
{
    http_request_T http_request;

    /* the request line looks like "GET /object HTTP/1.0" */
    http_request.method = strtok( http_request_buf, " " );
    if( http_request.method
        && strcmp( http_request.method, "GET" ) == 0 ) {

        http_request.object = strtok( 0, " " );
        http_request.protocol = strtok( 0, " \r\n" );

        /* a malformed request may have no object at all */
        if( http_request.object == 0 )
            error_page( connfd );

        else if( strcmp( http_request.object, "/" ) == 0 )
            home_page( connfd );

        else if( strcmp( http_request.object,
                         "/query?pressme=more_info" ) == 0 )
            more_info_page( connfd );

        else
            error_page( connfd );
    }

    return 0;
}

For example, let's say that all you want to do is provide a simple "home page" for your product that shows calendar time at the target device. To do this, you don't need to look at the arguments supplied with the HTTP request at all, because the response will be the same in all cases. The home_page() function that Figure 1 calls (listed in full in Appendix A) shows how to do this, assuming you can use a function called proprietary_localtime() to get time information from the target.

To see this example in action, simply compile the example code, launch the resulting executable, and then supply the following URL to your browser:

http://localhost/

If your workstation already has a web server installed, try changing the definition of LISTENPORT in the example code to an unused port number (for example, 8000). Recompile, then connect using this URL instead:

http://localhost:8000/

In any case, here is what the home_page() function called from Figure 1 does:

  • Provides an initial "okay" response to the client's browser,
  • Calls the function that gets the local time from the target,
  • Builds an HTML page that contains the response, and,
  • Sends that response back to the client's browser.
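The first three steps above can be sketched in a form that is easy to try without a socket. The helper name build_home_response() is an assumption of mine, not something from the example proxy, and proprietary_localtime() is only a stand-in that reads the host's clock rather than a real target.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Stand-in for the target query: a real proxy would ask the embedded
   device over its native link instead of reading the host's clock. */
static const char *proprietary_localtime(void)
{
    static char time_buf[32];   /* ctime() text needs 26 bytes */
    time_t now = time(NULL);

    snprintf(time_buf, sizeof time_buf, "%s", ctime(&now));
    return time_buf;
}

/* Hypothetical helper: build the complete response in `buf`.
   Returns the response length, or -1 if it did not fit. */
int build_home_response(char *buf, size_t buflen)
{
    int n = snprintf(buf, buflen,
                     "HTTP/1.0 200 OK\r\n"          /* the initial "okay" */
                     "Content-Type: text/html\r\n"
                     "\r\n"
                     "<html><h1>Welcome</h1>"
                     "Local time is: %s"            /* data from the target */
                     "</html>\r\n",
                     proprietary_localtime());

    return (n < 0 || (size_t)n >= buflen) ? -1 : n;
}
```

The last step is then a single call such as send( connfd, buf, n, 0 ), where connfd is the connected socket. Building the whole response in one buffer, rather than with several small sends, is simply a choice that keeps the sketch testable.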

8. A more sophisticated example

Let's now suppose that we want the target's home page to contain a button that the user can click to get more information from the target. This requires more intelligence in parse_http_request(), because we have to:

  • Send the client the home page with the button, and,
  • Determine which button the user pressed and respond accordingly.

This code is shown in Figure 2. It's doing the same thing as in the previous example, except that it is choosing which page to return based on whether the HTTP message says that the user clicked on the button labeled value or not.

Figure 2. A parser for a two-page proxy.

typedef struct
{
    char *method;
    char *object;
} http_request_T;

int
parse_http_request( int connfd, char *http_request_buf )
{
    http_request_T http_request;

    http_request.method = strtok( http_request_buf, " " );
    http_request.object = strtok( 0, " " );

    /* the user clicked the button labeled "value" */
    if ( http_request.object
         && strcmp( http_request.object, "/query?press=value" ) == 0 )
        send_other_page( connfd );
    else
        send_home_page( connfd );

    return 0;
}

9. Hidden values and proxy simplification

The previous examples are straightforward, but they won't scale very well to applications with more than a handful of pages. The reason is that parse_http_request() requires specific parsing code for each page, something that quickly becomes tedious and error-prone for anything beyond the simplest functionality.

When an application with many pages is desired, HTML's hidden values are the preferred way to manage the complexity without the drudgery of lots and lots of parsing code.

Figure 3 shows code that generates an HTML page with two forms, each containing a unique hidden value and a single button. When the user clicks one of the buttons, the browser includes the associated hidden value in the HTTP message, which makes it a convenient way for the proxy to determine what to do next.

Figure 3. A home page with hidden values.

void
home_state ( int connfd, char *http_object )
{
    char cbuf[1024];

    /* this looks ugly, but we're just brute-force building
       an html page that includes a proprietary callout and
       a couple of choices for the user */
    sprintf( cbuf,
             "<hr><h1>Welcome to the web-legacy proxy!</h1><hr>"
             "Local time is: %s"
             "<form method=\"get\" action=\"help\">"
             "<input type=\"text\" name=\"textfield\" "
             "value=\"type something here\" size=40>"
             "<input type=submit name=\"pressme\" value=\"Help!\">"
             "<input type=hidden name=\"state_id\" value=\"%d\">"
             "</form><form method=\"get\" action=\"query\">"
             "<input type=submit name=\"pressme\" value=\"Conclusions\">"
             "<input type=hidden name=\"state_id\" value=\"%d\">"
             "</form>", proprietary_localtime(), PAGE2, PAGE3 );

    /* send the page to the browser */
    send( connfd, standard_http_header, strlen( standard_http_header ), 0 );
    send( connfd, cbuf, strlen( cbuf ), 0 );
    send( connfd, standard_http_footer, strlen( standard_http_footer ), 0 );

    return;
}

When the user clicks on Help!, the browser sends state_id=2 (the value of PAGE2) to the proxy. Likewise, when the user clicks Conclusions, the browser sends state_id=3 (the value of PAGE3). The code in Figure 4 extracts the value of state_id from the HTTP message, and then looks up and invokes the state's associated function to generate the proper response.

Figure 4. Parsing a page with a hidden value.

typedef struct {
    int id;
    void (*state)( int connfd, char *http_object );
} http_state_T;

const http_state_T http_states[] = {
    { HOME, home_state },
    { PAGE2, page_2 },
    { PAGE3, page_3 },
    { 0, 0 }
};

void parse_http_request ( int connfd,
                          char *http_request_buf )
{
    http_request_T http_request;
    char *state_idstr;
    int state_id = 0;
    int wstate;

    /* make sure it's a "GET" message; if it isn't,
       we don't know what to do with it */
    http_request.method = strtok( http_request_buf, " " );
    if( http_request.method
        && strcmp( http_request.method, "GET" ) == 0 ) {

        /* crack apart the rest of the request */
        http_request.object = strtok( 0, " " );
        http_request.protocol = strtok( 0, " \r\n" );

        /* find the "state_id=" portion of the message */
        state_idstr = http_request.object
            ? strstr( http_request.object, "state_id=" ) : 0;
        if( state_idstr ) {

            /* get the number that follows "state_id="; state_id
               stays 0 (no such state) if it isn't a number */
            state_idstr = strchr( state_idstr, '=' ) + 1;
            sscanf( state_idstr, "%d", &state_id );

            /* look it up */
            for( wstate = 0;
                 http_states[wstate].id;
                 wstate++ ) {

                /* found it! invoke the state function */
                if( http_states[wstate].id == state_id ) {
                    http_states[wstate].state( connfd, http_request.object );
                    break;
                }
            }

            /* ran off the end of the table: unknown state */
            if( http_states[wstate].id == 0 )
                error_state( connfd );
        }

        /* there wasn't a "state_id=" in the message;
           default to the home page */
        else home_state( connfd, 0 );
    }

    return;
}

To add a new page to the application just add its associated state_id value and function to http_states[], and then adjust the contents of the referring page to deliver this value to the proxy at the proper time (when the user clicks on a button, for example). In other words, you no longer need to modify parse_http_request() when a page is added.

The http_states[] table is a kind of "site map" for the entire application that parse_http_request() uses to move the client through pages in the appropriate order. From another perspective, http_states[] is a state machine that drives the behavior of the proxy in response to user events encoded in state_id values. Whatever your interpretation, it should be clear that a state-driven proxy architecture makes it far easier to manage applications with a lot of pages than anything else I've shown you so far.
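As a rough illustration of how little changes when a page is added, here is a pared-down sketch of the same table-driven lookup. The PAGE4 identifier, the diagnostics page, and the simplified handler signature (returning a page name instead of writing to a socket) are all assumptions made to keep the sketch small and testable; they are not part of the example proxy.

```c
#include <stddef.h>

/* Simplified handler type: the article's handlers take a socket and the
   request object; returning a page name keeps this sketch testable. */
typedef const char *(*page_fn)(void);

typedef struct {
    int     id;
    page_fn page;
} state_entry;

enum { HOME = 1, PAGE2 = 2, PAGE3 = 3, PAGE4 = 4 /* hypothetical new page */ };

static const char *home_page(void)        { return "home"; }
static const char *help_page(void)        { return "help"; }
static const char *conclusions_page(void) { return "conclusions"; }
static const char *diagnostics_page(void) { return "diagnostics"; }

static const state_entry states[] = {
    { HOME,  home_page },
    { PAGE2, help_page },
    { PAGE3, conclusions_page },
    { PAGE4, diagnostics_page },   /* the only table change for the new page */
    { 0, NULL }                    /* end-of-table marker */
};

/* Returns the handler registered for state_id, or NULL if there is none. */
page_fn find_page(int state_id)
{
    size_t i;

    for (i = 0; states[i].id != 0; i++) {
        if (states[i].id == state_id)
            return states[i].page;
    }
    return NULL;
}
```

The lookup function never changes; supporting the new page means writing its handler, adding one table row, and having some referring page emit state_id=4 in a hidden field.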

10. But can't I do all of this with CGI?

Yes and no. The examples shown here include portions of web server functionality that most CGI applications don't have, in particular the ability to receive HTTP requests from an IP port via bind(), accept(), and read(). As such, our proxies can run on machines that lack a web server, which would be the case for most of your clients' PCs.

On the other hand, a CGI-based approach makes sense when you need a proxy that can run on different kinds of hosts, or there is the possibility that the proxy will run on a host that is already running a web server. In the event that you produce a CGI proxy but a web server isn't available, the example proxy in this article can serve as a minimal web server that forwards HTTP to the proxy via an exec or similar system call.

A CGI-style proxy also has the advantage of being able to replace calls to write() et al with printf() and family, because the application's standard input and output streams are automatically bound to the TCP/IP socket used by the browser. There is no reason that the examples shown in this article could not be modified to do the same thing, by modifying the stdio file descriptors in main() before calling parse_http_request().
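One way this redirection might look on a POSIX host is sketched below. The helper with_stdout_on_fd() and the hello_page() handler are hypothetical names of mine, and the approach relies on dup() and dup2(), so it would not carry over unchanged to WinSock, where sockets are not ordinary file descriptors.

```c
#include <stdio.h>
#include <unistd.h>

/* Example page written CGI-style, with plain printf(). */
static void hello_page(void)
{
    printf("<html>hello</html>\n");
}

/* Hypothetical helper: point stdout at `fd` (for example, the connected
   socket), run `page`, then put the original stdout back.
   Returns 0 on success, -1 on failure. */
int with_stdout_on_fd(int fd, void (*page)(void))
{
    int saved;
    int rc = 0;

    fflush(stdout);                     /* don't mix in earlier output */
    saved = dup(STDOUT_FILENO);
    if (saved < 0)
        return -1;

    if (dup2(fd, STDOUT_FILENO) < 0) {
        close(saved);
        return -1;
    }

    page();
    fflush(stdout);                     /* push the page out through fd */

    if (dup2(saved, STDOUT_FILENO) < 0)
        rc = -1;
    close(saved);
    return rc;
}
```

In the example proxy, a call such as with_stdout_on_fd( connfd, hello_page ) would take the place of the individual send() calls for that page.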

11. Disadvantages of proxies

HTTP proxies are a simple and powerful way to get a legacy product onto the Internet, but they do have their limitations. To begin with, the proxy must be properly installed and running somewhere before communication with the target system is possible. In contrast, for targets with integrated Ethernet and HTTP/TCP/IP capabilities, the user only needs to plug in a cable and type in a URL.

A standalone proxy also does nothing to assure that the target interfaces it uses are properly maintained. A compiled-in HTTP server, in contrast, will likely produce compilation or link errors if a function it needs is accidentally removed from a new version of the target's application.

Finally, successful proxies require some knowledge of the host's networking and other APIs, which may present problems for developers with no skills in this area. I consider this an item of minimal concern, however, given the number of excellent TCP/IP and other networking books available in the mainstream press today.

12. Flexibility and frugality

When you need to get a legacy system talking to the Internet, a proxy is probably the best way to go about it. In addition to their simplicity, proxies offer flexibility and frugality that's tough to match using any other approach.

HTTP proxies are not difficult to implement, and they don't require modification of target software. As a result, the Internet appliances your customers want tomorrow could very well be the devices you are already building today.

13. Resources

Gundavaram, Shishir. CGI Programming on the World Wide Web. This book is only available on-line now, at www.oreilly.com/openbook/cgi.

Guelich, Scott, Shishir Gundavaram, and Gunther Birznieks. CGI Programming with Perl, 2nd ed. This book will be published by O'Reilly & Associates in July 2000.

Stevens, W. Richard. Unix Network Programming. Upper Saddle River, NJ: Prentice-Hall, 1997.

www.webmonkey.com/backend/protocols/

Just about everything on www.w3.org, if you want the gory details.

A. Source Code for a Basic HTTP Proxy

#if defined(WIN32)

/*
* Ported to Windows by Michael Barr <mbarr@cmp.com> on 3/13/2000.
* Note that you must link with the ws2_32.lib library so the
* WinSock DLL can be found implicitly. Tested with MSVC 5.0.
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <winsock2.h>
#include <time.h>

#define close(S) closesocket((S))

#else

/*
* Not Win32. Originally developed for Linux by Bill Gatliff.
*/
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include <signal.h>
#include <time.h>
#include <sys/times.h>

#endif

const char *ver_info = "web-legacy proxy $Id: web-legacy.sgml,v 1.8 2001/09/17 19:44:19 bgat Exp $";

/* larger numbers give more debugging output */
#define DEBUG_VERBOSITY 2

/* make TEXTFIELD nonzero to include a text field
in the target device's home page */
#define TEXTFIELD 0

/* By default, browsers want to connect to port 80
on the server machine. If your PC has a web server
installed already, however, then we can't use it here.
In that case, use port 8000, or some other unused one,
and use an URL like this:
http://localhost:8000/
*/
#define LISTENPORT 8000

/* used when analyzing http messages */
typedef struct {
char *method;
char *object;
char *protocol;
char *connection;
} http_request_T;

const char itoh[] = { '0', '1', '2', '3', '4', '5', '6', '7',
'8', '9', 'a', 'b', 'c', 'd', 'e', 'f' };

const char *standard_http_header = "HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n<html>";
const char *standard_http_footer = "</html>\n\n";

/*
  Simulates a "localtime" query from the target system.

  Under normal circumstances, this function would go talk to
  the embedded system using its preferred media, i.e. the
  serial port. I don't know what your product's media and
  protocols are, though, so this function just simulates
  a call that returns the target's calendar time.
*/
const char *proprietary_localtime ( void )
{
    /* ctime() returns 26 bytes, including the newline and the NUL */
    static char time_buf[32];
    time_t local_time;

    local_time = time( 0 );
    sprintf( time_buf, "%s", ctime( &local_time ));

    return time_buf;
}

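/*
  Grabs an HTTP message from a socket.

  Returns the number of bytes stored in buf (which is always
  NUL-terminated), or -1 if the request did not fit in maxlen bytes.
*/
int
read_http_request ( int fd,
                    char *buf,
                    int maxlen )
{
    int len = 0;
    unsigned long terminator_buf = 0UL;
    char c;

    if( maxlen <= 0 )
        return -1;

    /* retrieve the client's request string,
       leaving room for the terminating NUL */
    while( len < maxlen - 1
           && recv( fd, &c, 1, 0 ) > 0 ) {

        buf[len++] = c;

#if ( DEBUG_VERBOSITY == 3 )
        fprintf( stderr, "%c [%c%c]\n", c,
                 itoh[((unsigned char)c) >> 4], itoh[c & 0xf] );
#endif
#if ( DEBUG_VERBOSITY == 2 )
        fprintf( stderr, "%c", c );
#endif

        /* look for a \r\n\r\n terminator */
        terminator_buf = (( terminator_buf << 8 )
                          | (unsigned char)c ) & 0xffffffffUL;
        if( terminator_buf == 0x0d0a0d0aUL ) {
            buf[len] = 0;
            return len;
        }
    }

    buf[len] = 0;

    /* the buffer filled up before the terminator arrived */
    if( len >= maxlen - 1 )
        return -1;

    return len;
}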

/*
Our target device's "home page".

Remember, we're a *proxy* for our target's network connection, so
the physical target embedded system doesn't really have a home page.
*/
int
home_page( int connfd )
{
const char *page_top = "<hr><h1>Welcome to the web-legacy proxy!</h1><hr>"
"Local time is: ";
const char *page_bottom = "<form method=\"get\" action=\"query\">"
#if TEXTFIELD != 0
"<input type=\"text\" name=\"textfield\" value=\"default\" size=40>"
#endif
"<input type=submit name=\"pressme\" value=\"more_info\">"
"</form><form method=\"get\" action=\"query\">"
"<input type=submit name=\"pressme2\" value=\"more_info2\">"
"</form>"
;

/* ask the target device what time it is there */
const char *timestr = proprietary_localtime();

/* send the page to the browser */
send( connfd, standard_http_header, strlen( standard_http_header ), 0 );
send( connfd, page_top, strlen( page_top ), 0 );
send( connfd, timestr, strlen( timestr ), 0 );
send( connfd, page_bottom, strlen( page_bottom ), 0 );
send( connfd, standard_http_footer, strlen( standard_http_footer ), 0 );

return 0;
}

/*
Code for the page that gets sent when the user clicks "more info".

This is typical of a proxy for an embedded system--- code that never
talks to the target itself, but only exists to make things easier for
the user (i.e. online help).
*/
int
more_info_page ( int connfd )
{
const char *page = "<hr><h1>Welcome to the web-legacy proxy!</h1><hr>"
"This example code is copyright William A. Gatliff. See the file "
"COPYING for details, or email bgat@open-widgets.com."
"<br><hr>"

"I hope that by now you understand the general ideas behind an "
"HTTP proxy. For more information, see the following references."
"<br><br>"

"The author welcomes feedback and questions via email to "
"bgat@open-widgets.com. Thanks!"
"<hr>"

"Unix Network Programming, by W. Richard Stevens; ISBN 0-13-490012-X<hr>"

"CGI Programming with Perl, by Guelich, Gundavaram and Birznieks; "
"ISBN 1-56592-419-3 (first edition is better, if you can get it)<hr>"

"Just about everything on http://www.w3.org/<hr>";

send( connfd, standard_http_header, strlen( standard_http_header ), 0 );
send( connfd, page, strlen( page ), 0 );
send( connfd, standard_http_footer, strlen( standard_http_footer ), 0 );

return 0;
}

/*
The page we send when we don't know what else to do.
We should probably use HTTP error codes as well (instead
of sending back an OK all the time), but for example purposes
this complication isn't warranted.

Note: I included a "more_info2" link on the home page that
forces the user to this page, even though an error didn't occur.
*/
int
error_page ( int connfd )
{
const char *page = "<hr><h1>Welcome to the web-legacy proxy!</h1><hr>"
"Error! I don't have a page like that, or you pressed the "
"more_info2 button on the home page (I included that dangling link "
"on purpose).<hr>";

send( connfd, standard_http_header, strlen( standard_http_header ), 0 );
send( connfd, page, strlen( page ), 0 );
send( connfd, standard_http_footer, strlen( standard_http_footer ), 0 );

return 0;
}

/*
Cracks apart an http message, then figures out what to do with it.
*/
int
parse_http_request ( int connfd,
                     char *http_request_buf )
{
    http_request_T http_request;

    /* the request line looks like "GET /object HTTP/1.0" */
    http_request.method = strtok( http_request_buf, " " );
    if( http_request.method
        && strcmp( http_request.method, "GET" ) == 0 ) {

        http_request.object = strtok( 0, " " );
        http_request.protocol = strtok( 0, " \r\n" );

        /* a malformed request may have no object at all */
        if( http_request.object == 0 )
            error_page( connfd );

        else if( strcmp( http_request.object, "/" ) == 0 )
            home_page( connfd );

        else if( strcmp( http_request.object,
                         "/query?pressme=more_info" ) == 0 )
            more_info_page( connfd );

        else
            error_page( connfd );
    }

    return 0;
}

/*
Sets up the listening socket, then answers one request at a time.
*/
int
main ( void )
{
int bindres;
int listenfd, connfd;
struct sockaddr_in servaddr;

char http_request_buf[1000];

#if defined(WIN32)

WORD wVersionRequested;
WSADATA wsaData;
int err;

wVersionRequested = MAKEWORD( 2, 0 );

err = WSAStartup( wVersionRequested, &wsaData );
if ( err != 0 ) {
/* Tell the user that we couldn't find a usable */
/* WinSock DLL. */
fprintf( stderr, "Couldn't find a usable WinSock DLL\n" );
return -1;
}

/* Confirm that the WinSock DLL supports 2.0.*/
/* Note that if the DLL supports versions greater */
/* than 2.0 in addition to 2.0, it will still return */
/* 2.0 in wVersion since that is the version we */
/* requested. */

if ( LOBYTE( wsaData.wVersion ) != 2 ||
HIBYTE( wsaData.wVersion ) != 0 ) {
/* Tell the user that we couldn't find a usable */
/* WinSock DLL. */
fprintf( stderr, "Couldn't find a usable WinSock DLL\n" );
WSACleanup( );
return -1;
}

/* The WinSock DLL is acceptable. Proceed. */

#endif

/* initialize servaddr */
memset( &servaddr, 0, sizeof( servaddr ) );

/* bind to TCP/IP port LISTENPORT on all local interfaces */
servaddr.sin_family = AF_INET;
servaddr.sin_addr.s_addr = htonl( INADDR_ANY );
servaddr.sin_port = htons( LISTENPORT );
listenfd = socket( AF_INET, SOCK_STREAM, 0 );
bindres = bind( listenfd, (struct sockaddr *)&servaddr,
sizeof( servaddr ));

if( bindres != 0 ) {
fprintf( stderr, "Couldn't bind to port %d.\n%s\n",
LISTENPORT,

#if LISTENPORT==80
"(are you running a web server on that port?)"
#else
"(wait a few seconds, then try again)"
#endif
);

exit( 1 );
}

fprintf( stderr, "\n%s\n", ver_info );

while( 1 ) {

/* wait for someone to come along... */
fprintf( stderr,
"\nproxy: listening on port %d...\n",
LISTENPORT );
listen( listenfd, 1 );

/* accept the connection */
connfd = accept( listenfd, (struct sockaddr *)NULL, NULL );

/* read the http request */
read_http_request( connfd, http_request_buf,
sizeof( http_request_buf ));

/* parse and act upon the request */
parse_http_request( connfd, http_request_buf );

/* close out the connection, which
signals to the browser that we're done */
close( connfd );
}

#if defined(WIN32)
WSACleanup();
#endif

return 0;
}

B. Source Code for a State-Driven HTTP Proxy

#if defined(WIN32)

/*
* Ported to Windows by Michael Barr <mbarr@cmp.com> on 3/13/2000.
* Note that you must link with the ws2_32.lib library so the
* WinSock DLL can be found implicitly. Tested with MSVC 5.0.
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <winsock2.h>
#include <time.h>

#define close(S) closesocket((S))

#else

/*
* Not Win32. Originally developed for Linux by Bill Gatliff.
*/
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include <signal.h>
#include <time.h>
#include <sys/times.h>

#endif

const char *ver_info = "web-legacy proxy $Id: web-legacy.sgml,v 1.8 2001/09/17 19:44:19 bgat Exp $";

/* larger numbers give more debugging output */
#define DEBUG_VERBOSITY 2

/* make TEXTFIELD nonzero to include a text field
in the target device's home page */
#define TEXTFIELD 0

/* By default, browsers want to connect to port 80
on the server machine. If your PC has a web server
installed already, however, then we can't use it here.
In that case, use port 8000, or some other unused one,
and use an URL like this:
http://localhost:8000/
*/
#define LISTENPORT 8000

/* used when analyzing http messages */
typedef struct {
char *method;
char *object;
char *protocol;
char *connection;
} http_request_T;

typedef struct {
int id;
void (*state)( int connfd, char *http_object );
} http_state_T;

#define HOME 1
#define PAGE2 2
#define PAGE3 3

const char itoh[] = { '0', '1', '2', '3', '4', '5', '6', '7',
'8', '9', 'a', 'b', 'c', 'd', 'e', 'f' };

const char *standard_http_header
= "HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n<html>";
const char *standard_http_footer
= "</html>\n\n";

/*
  Simulates a "localtime" query from the target system.

  In a real example, this function would actually talk to the
  target hardware, using some kind of non-IP protocol.
*/
const char *proprietary_localtime ( void )
{
    /* ctime() returns 26 bytes, including the newline and the NUL */
    static char time_buf[32];
    time_t local_time;

    local_time = time( 0 );
    sprintf( time_buf, "%s", ctime( &local_time ));

    return time_buf;
}

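/*
  Grabs an HTTP message.

  Returns the number of bytes stored in buf (which is always
  NUL-terminated), or -1 if the request did not fit in maxlen bytes.
*/
int
read_http_request ( int fd,
                    char *buf,
                    int maxlen )
{
    int len = 0;
    unsigned long terminator_buf = 0UL;
    char c;

    if( maxlen <= 0 )
        return -1;

    /* retrieve the client's request string,
       leaving room for the terminating NUL */
    while( len < maxlen - 1
           && recv( fd, &c, 1, 0 ) > 0 ) {

        buf[len++] = c;

#if ( DEBUG_VERBOSITY == 3 )
        fprintf( stderr, "%c [%c%c]\n", c,
                 itoh[((unsigned char)c) >> 4], itoh[c & 0xf] );
#endif
#if ( DEBUG_VERBOSITY == 2 )
        fprintf( stderr, "%c", c );
#endif

        /* look for a \r\n\r\n terminator */
        terminator_buf = (( terminator_buf << 8 )
                          | (unsigned char)c ) & 0xffffffffUL;
        if( terminator_buf == 0x0d0a0d0aUL ) {
            buf[len] = 0;
            return len;
        }
    }

    buf[len] = 0;

    /* the buffer filled up before the terminator arrived */
    if( len >= maxlen - 1 )
        return -1;

    return len;
}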

/*
Our target device's "home page".
*/
void
home_state ( int connfd, char *http_object )
{
char cbuf[1024];

/* this looks ugly, but we're just brute-force building
an html page that includes a proprietary callout and
a couple of choices for the user */
sprintf( cbuf,
"<hr><h1>Welcome to the web-legacy proxy!</h1><hr>"
"Local time is: %s"
"<form method=\"get\" action=\"help\">"
"<input type=\"text\" name=\"textfield\" value=\"type something here\" size=40>"
"<input type=submit name=\"pressme\" value=\"Help!\">"
"<input type=hidden name=\"state_id\" value=\"%d\">"
"</form><form method=\"get\" action=\"query\">"
"<input type=submit name=\"pressme\" value=\"Conclusions\">"
"<input type=hidden name=\"state_id\" value=\"%d\">"
"</form>", proprietary_localtime(), PAGE2, PAGE3 );

/* send the page to the browser */
send( connfd, standard_http_header, strlen( standard_http_header ), 0 );
send( connfd, cbuf, strlen( cbuf ), 0 );
send( connfd, standard_http_footer, strlen( standard_http_footer ), 0 );

return;
}

/*
The "Help!" page.
*/
void
page_2 ( int connfd, char *http_object )
{
char cbuf[1024];

char *query_text;
char *query_end;

/* find the "textfield=" portion of the message */
query_text = http_object ? strstr( http_object, "textfield=" ) : 0;
if( query_text ) {

/* get the text that follows "textfield=", and terminate it
at the next '&' (if there is one) */
query_text = strchr( query_text, '=' ) + 1;
query_end = strchr( query_text, '&' );
if( query_end )
*query_end = 0;
}

send( connfd, standard_http_header, strlen( standard_http_header ), 0 );

sprintf( cbuf,
"<hr><h1>Welcome to the \"Help!\" page!</h1><hr><br>"
"This example code is copyright William A. Gatliff. See the "
"file COPYING for details. You can also email questions and "
"comments to <a href=mailto:bgat@open-widgets.com>"
"bgat@open-widgets.com</a><br><hr>" );

send( connfd, cbuf, strlen( cbuf ), 0 );

#if defined(WIN32)
sprintf( cbuf,
"You got here because the proxy received the message "
"\"state_id=%d\", due to your clicking on the appropriate "
"location on the previous page. In response, the proxy "
"invoked the function found in the table <b>http_states[]</b>"
".<hr>", PAGE2);
#else
sprintf( cbuf,
"You got here because the proxy received the message "
"\"state_id=%d\", due to your clicking on the appropriate "
"location on the previous page. In response, the proxy "
"invoked the function <b>%s()</b> because that's what "
"the information in the table <b>http_states[]</b> told "
"it to do.<hr>", PAGE2, __FUNCTION__ );
#endif

send( connfd, cbuf, strlen( cbuf ), 0 );

sprintf( cbuf,
"Sure, this page won't win any awards for beauty or style, "
"but I'm not much of an HTML programmer, either. The essential points "
"should be clear by now, though. And besides, wouldn't you prefer I "
"spend my time on the technical discussion, anyway? :^)<br><br>" );

send( connfd, cbuf, strlen( cbuf ), 0 );

if( query_text ) {
/* limit the echoed text so that it can't overflow cbuf */
sprintf( cbuf, "And by the way, on the home page, "
"you typed:<br><br><b>%.900s</b><br><br><hr>",
query_text );

send( connfd, cbuf, strlen( cbuf ), 0 );
}

sprintf( cbuf,
"For your amusement, I have included a link to "
"<a href=http://www.slashdot.org>slashdot</a>. News for Nerds. "
"Stuff that Matters.<br><br>" );

send( connfd, cbuf, strlen( cbuf ), 0 );

sprintf( cbuf,
"<form method=\"get\" action=\"back\">"
"<input type=submit name=\"pressme\" value=\"Back\">"
"<input type=hidden name=\"state_id\" value=\"%d\">"
"</form>", HOME );

send( connfd, cbuf, strlen( cbuf ), 0 );

send( connfd, standard_http_footer, strlen( standard_http_footer ), 0 );

return;
}

/*
The "Conclusions" page.
*/
void
page_3 ( int connfd, char *http_object )
{

char cbuf[1024];

send( connfd, standard_http_header, strlen( standard_http_header ), 0 );

sprintf( cbuf,
"<hr><h1>Welcome to the \"Conclusions!\" page!</h1><hr><br>"
"This example code is copyright William A. Gatliff. See the "
"file COPYING for details. You can also email questions and "
"comments to <a href=mailto:bgat@open-widgets.com>"
"bgat@open-widgets.com</a><br><hr>" );

send( connfd, cbuf, strlen( cbuf ), 0 );

sprintf( cbuf,
"This application has a grand total of three pages, and you've "
"probably seen them all by now. With a state-driven proxy, however, "
"we don't have to modify any of the proxy's structure when adding or "
"changing pages--- we just have to modify <b>http_states[]</b> "
"to include the new or updated function, and make sure that someone "
"produces a state_id that corresponds to the new page.<br><br>" );

send( connfd, cbuf, strlen( cbuf ), 0 );

sprintf( cbuf,
"Alright, hopefully by now you've got the point. So why are you still "
"sitting there?! Go get your stuff onto the Web!<br><br>" );

send( connfd, cbuf, strlen( cbuf ), 0 );

sprintf( cbuf,
"<form method=\"get\" action=\"back\">"
"<input type=submit name=\"pressme\" value=\"Back\">"
"<input type=hidden name=\"state_id\" value=\"%d\">"
"</form>", HOME );

send( connfd, cbuf, strlen( cbuf ), 0 );

send( connfd, standard_http_footer, strlen( standard_http_footer ), 0 );

return;
}

/*
The page we send when we don't know what else to do.
We should probably use HTTP error codes as well (instead
of sending back an OK all the time), but for example purposes
this complication isn't warranted.
*/
int
error_state ( int connfd )
{
const char *page = "<hr><h1>Welcome to the web-legacy proxy!</h1><hr>"
"<h1>Error! I don't have a page like that.</h1><hr>";

send( connfd, standard_http_header, strlen( standard_http_header ), 0 );
send( connfd, page, strlen( page ), 0 );
send( connfd, standard_http_footer, strlen( standard_http_footer ), 0 );

return 0;
}

const http_state_T http_states[] = {
{ HOME, home_state },
{ PAGE2, page_2 },
{ PAGE3, page_3 },
{ 0, 0 }
};

/*
Cracks apart an http message, then invokes the proper
state handler based on what it finds in the message.
*/
void
parse_http_request ( int connfd,
char *http_request_buf )
{
http_request_T http_request;
char *state_idstr;
int state_id;
int wstate;

/* make sure it's a "GET" message; if it isn't (or if the
request is empty), we don't know what to do with it */
http_request.method = strtok( http_request_buf, " " );
if( http_request.method
&& strcmp( http_request.method, "GET" ) == 0 ) {

/* crack apart the rest of the request */
http_request.object = strtok( 0, " " );
http_request.protocol = strtok( 0, " \r\n" );

/* find the "state_id=" portion of the message */
state_idstr = http_request.object
? strstr( http_request.object, "state_id=" ) : 0;
if( state_idstr ) {

/* get the number that follows "state_id="; if it isn't a
number, use 0, which matches no entry in http_states[] */
state_idstr = strchr( state_idstr, '=' ) + 1;
if( sscanf( state_idstr, "%d", &state_id ) != 1 )
state_id = 0;

/* look it up */
for( wstate = 0;
http_states[wstate].id;
wstate++ ) {

/* found it! invoke the state function */
if( http_states[wstate].id == state_id ) {
http_states[wstate].state( connfd, http_request.object );
break;
}
}

if( http_states[wstate].id == 0 )
error_state( connfd );
}

/* there wasn't a "state_id=" in the message;
default to the home page */
else home_state( connfd, 0 );
}

return;
}

/*
Sets up the listening socket, then services
browser requests one at a time, forever.
*/
int
main ( void )
{
int bindres;
int listenfd, connfd;
struct sockaddr_in servaddr;

char http_request_buf[1000];

#if defined(WIN32)

WORD wVersionRequested;
WSADATA wsaData;
int err;

wVersionRequested = MAKEWORD( 2, 0 );

err = WSAStartup( wVersionRequested, &wsaData );
if ( err != 0 ) {
/* Tell the user that we couldn't find a usable */
/* WinSock DLL. */
fprintf( stderr, "Couldn't find a usable WinSock DLL\n" );
return -1;
}

/* Confirm that the WinSock DLL supports 2.0.*/
/* Note that if the DLL supports versions greater */
/* than 2.0 in addition to 2.0, it will still return */
/* 2.0 in wVersion since that is the version we */
/* requested. */

if ( LOBYTE( wsaData.wVersion ) != 2 ||
HIBYTE( wsaData.wVersion ) != 0 ) {
/* Tell the user that we couldn't find a usable */
/* WinSock DLL. */
fprintf( stderr, "Couldn't find a usable WinSock DLL\n" );
WSACleanup( );
return -1;
}

/* The WinSock DLL is acceptable. Proceed. */

#endif

fprintf( stderr, "\n%s\n", ver_info );

/* initialize servaddr */
memset( &servaddr, 0, sizeof( servaddr ) );

/* bind to TCP/IP port LISTENPORT */
servaddr.sin_family = AF_INET;
servaddr.sin_addr.s_addr = htonl( INADDR_ANY );
servaddr.sin_port = htons( LISTENPORT );
listenfd = socket( AF_INET, SOCK_STREAM, 0 );
if( listenfd < 0 ) {
fprintf( stderr, "Couldn't create a socket.\n" );
exit( 1 );
}
bindres = bind( listenfd, (struct sockaddr *)&servaddr,
sizeof( servaddr ));

if( bindres != 0 ) {
fprintf( stderr, "Couldn't bind to port %d. %s\n",
LISTENPORT,

#if LISTENPORT==80
"(are you running a web server on that port?)"
#else
"(wait a few seconds, then try again)"
#endif
);

exit( 1 );
}

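/* start listening; this only needs to happen once */
if( listen( listenfd, 1 ) != 0 ) {
fprintf( stderr, "Couldn't listen on port %d.\n", LISTENPORT );
exit( 1 );
}

while( 1 ) {

/* wait for someone to come along... */
fprintf( stderr,
"\nproxy: listening on port %d...\n",
LISTENPORT );

/* accept the connection */
connfd = accept( listenfd, (struct sockaddr *)NULL, NULL );
if( connfd < 0 )
continue;

/* read the http request; parse it unless it overflowed
the buffer, in which case send the error page */
if( read_http_request( connfd, http_request_buf,
sizeof( http_request_buf )) != -1 )
parse_http_request( connfd, http_request_buf );
else
error_state( connfd );

close( connfd );
}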

#if defined(WIN32)
WSACleanup();
#endif

return 0;
}
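
The "Help!" page echoes the text field exactly as the browser sent it, so spaces arrive as '+' and most punctuation arrives as %XX escapes. The following is a minimal sketch of decoding such text in place. The url_decode() helper is hypothetical and is not part of the original listing; it assumes the standard form encoding that browsers apply to GET parameters.

```c
#include <ctype.h>
#include <stdlib.h>

/*
  Hypothetical helper (not in the original listing): decodes
  form-encoded text in place. '+' becomes a space, and %XX
  becomes the byte with hex value XX. Malformed escapes are
  copied through unchanged. Returns s.
*/
char *
url_decode ( char *s )
{
  char *src = s;
  char *dst = s;
  char hex[3];

  while( *src ) {
    if( *src == '+' ) {
      *dst++ = ' ';
      src++;
    }
    else if( src[0] == '%'
             && isxdigit( (unsigned char)src[1] )
             && isxdigit( (unsigned char)src[2] ) ) {
      /* convert the two hex digits that follow the '%' */
      hex[0] = src[1];
      hex[1] = src[2];
      hex[2] = 0;
      *dst++ = (char)strtol( hex, 0, 16 );
      src += 3;
    }
    else {
      *dst++ = *src++;
    }
  }

  /* the decoded text is never longer than the encoded text */
  *dst = 0;
  return s;
}
```

In page_2(), calling url_decode( query_text ) just before the text is echoed would be one way to use this. The decoded text can contain characters such as < and >, so a production proxy would probably also want to HTML-escape it before sending it back to the browser.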

Notes

  1. Not all network firewalls are implemented as proxies.
  2. The actual HTTP message is a bit longer than this because it also includes information on the type of browser and operating system you are using and the identity of your machine. The GET is the essential text, however.
  3. In some cases, you'll need to use raw IP addresses, for example, http://127.0.0.1/ or http://127.0.0.1:8000/.
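
As note 2 points out, the GET line is followed by header lines that the proxy doesn't need. The sketch below shows one way to pull the state_id out of a complete request while ignoring anything that appears in those headers. The request_state_id() helper is hypothetical; unlike the listing's strtok()-based parser, it leaves the request buffer unmodified.

```c
#include <stdio.h>
#include <string.h>

/*
  Hypothetical helper (not in the original listing): returns the
  state_id found on the request line (the first line) of an HTTP
  request, or -1 if the request line doesn't carry a numeric one.
  Header lines that follow the request line are ignored.
*/
int
request_state_id ( const char *request )
{
  size_t linelen = strcspn( request, "\r\n" );
  const char *p = strstr( request, "state_id=" );
  int id;

  /* a match beyond the first line belongs to a header, not the GET */
  if( p == 0 || (size_t)( p - request ) >= linelen )
    return -1;

  /* convert the digits that follow "state_id=" */
  if( sscanf( p + strlen( "state_id=" ), "%d", &id ) != 1 )
    return -1;

  return id;
}
```

Given a request whose first line is "GET /help?textfield=hi&state_id=2 HTTP/1.0", this returns 2 no matter what the browser adds in the headers that follow.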
 
