coding

About programming and programming languages (especially Java and C#)

Yo dawg, I heard you like QR codes, so I put a QR code in your QR code.


Many people keep saying that Python or Ruby promote functional programming just because you can pass functions as values. Actually, you can do that in C since 1978.

Alright, these two languages have some features that are also present in functional programming languages and in fact it's not easy to define exactly what functional programming is, but remember that functional programming is declarative as opposed to imperative. (See Programming Language Pragmatics by Michael L. Scott). So, here is a nice quote that describes what functional programming is not:

  • "A Programming Language that is characterized by these three properties - the sequential execution of instructions, the use of variables representing memory locations, and the use of assignment to change the values of variables - is called an imperative language, since its primary feature is a sequence of statements that represents commands, or imperatives."

    Programming Languages: Principles and Practice by Kenneth C. Louden

To summarize: if you assign variables then you are probably not doing functional programming. It's that easy.

I recently looked into Google's V8 JavaScript Engine to learn more about the its design and specifically the implementation of inline caching. It is really amazing how much work was done by the V8 team to make it faster.

If you are not into programming languages and compilers, then just skip the next paragraph and go directly to the last paragraph and the nice graphs below.

  • To summarize it, the usual property lookup of a JavaScript object can be naively implemented with a kind of dictionary, like e.g. a hash map. The way it is done in V8 is to store the properties inside the JavaScript object on the heap. The byte offset that is needed to access the property inside the object is stored in map. This map is shared between JavaScript object with similar memory layout and therefore can be considered a hidden class system. After two successful lookups, the call-side of the accessed property is patched at runtime to omit the lookup and instead add the offset to the object pointer and jump directly to the memory address (in case of a method call). This technique is not easy to implement, but it is really effective as can be seen by the benchmark results below.

There are many different JavaScript benchmarks and I have chosen three of these which are popular, well understood and come from different browser vendors. All tests were performed with an Intel(R) Core(TM) Duo CPU with 1.66GHz and a Linux 2.6.33 operating system. The first browser, Opera, is proprietary and just added for reference. Then, the open source Firefox browser and its built-in JavaScript engine, "SpiderMonkey" was tested both with and without the tracing JIT-Compiler "TraceMonkey" and in two different version. The two most important configurations was V8 with and without inline caching (IC) enabled.


Dromaeo Score (The higher the better)

SunSpider Runtime (The lower the better)

V8 Benchmark Suite Score (The higher the better)

I watched an interesting Google tech talk about Google Native Client and tried it out. Apparently it needs a tool called gclient from the Google Chromium depot_tools. I installed it (using Yaourt in Arch Linux, of course) and was suprised to see some Python files compiled by GCC.

This was done by another Google project called Shed Skin. It's a Python-to-C++ compiler that uses type analysis. I wondered how the performance impact was and wrote a small script that sorts large lists of numbers. The time measurements are shown below:

Setting

  • AMD64 3000+
  • Linux 2.6.32-ARCH x86_64
  • Python 2.6.4
  • Shed Skin 0.3-2
  • Built-in sorting algorithm

Code

#!/usr/bin/python
# -<em>- coding: utf-8 -</em>-

from random import randint
from time import time

data = []
for i in range(1 << 16):
 data.append(randint(0, 255))

start = time()
data.sort()
print time() - start

Running

Using python it's quite obvious:

$ python sort.py

0.0448120727539

or you can just chmod +x sort.py and then enter ./sort.py.

Shed Skin is really simple to use as well:

$ shedskin sort.py

[iterative type analysis..]
[generating c++ code..]

$ make

g++ -O2 -pipe -Wno-deprecated  -I. ...

$ ./sort

0.00659690628052

Results

In the end, Shed Skin was about eight times faster than CPython, which is great, but not as great as I had expected. I wonder how PyPy compares to this.

Implementation 1^16 Integers 2^23 Integers
Python 45ms 8350ms
Shed Skin 7ms 1316ms

Execution time (lower is better)

I had a discussion with my professor because the lecture on middleware covered SOAP but not REST.

In my opinion both have their strengths and weaknesses, but REST is way better in term of chaching. To prove that I implemented a simple server and client both with SOAP and RESTful HTTP. The results won't be very accurate but sufficient for comparing the magnitude of the performance. To make the comparison easier I used Java as programming language for both client and server.

SOAP

Implementing the Server

  1. Download the latest version (2.2.5) of Apache CFX and untar it
  2. Create a new Java project in Eclipse and add all JARs of Apaches CFX and its dependencies as referenced libraries.
  3. Write the interface for the server (it's basically a book collection)

        @WebService
        public interface BookCollection {
            void addBook(Book book);
            Book getBook(int isbn);
            Book[] getAllBooks();
            void editBook(Book book);
            void deleteBook(int isbn);
        }
       
  4. Write the data class

        public class Book {
            public int isbn;
            public String author;
            public String title;
        }
       

    You might want to encapsulate the fields. In Eclipse you can use the menu Source -> Generate Getters and Setters ...

  5. Write the implementation

        @WebService(
            endpointInterface = "de.livoris.restvssoap.BookCollection",
            serviceName = "MyBooks")
        public class MyBookCollection implements BookCollection {

            private Map<Integer, Book> books = new HashMap<Integer, Book>();
       
            @Override
            public void addBook(Book book) {
                books.put(book.getIsbn(), book);
            }

            @Override
            public void deleteBook(int isbn) {
                books.remove(isbn);
            }

            @Override
            public void editBook(Book book) {
                Book old = getBook(book.getIsbn());
                old.setAuthor(book.getAthor());
                old.setTitle(book.getTitle());
            }

            @Override
            public Book[] getAllBooks() {
                return books.values().toArray(new Book[books.size()]);
            }

            @Override
            public Book getBook(int isbn) {
                return books.get(isbn);
            }
        }
       
  6. Writing the actual SOAP-Server (better in a new Java project)

        public class Server {
            public static void main(String[] args) throws IOException {
                System.out.println("Starting Server");
                MyBookCollection myBooks = new MyBookCollection();
                String address = "http://localhost:9000/myBooks";
                Endpoint.publish(address, myBooks);
                System.in.read();
            }
        }
       
  7. The WSDL of the SOAP service is now accessible through http://localhost:9000/myBooks?wsdl.

Implementing the Client

  1. Apache CFX has a tool for generating a proxy from the WSDL. To generate the SOAP proxy for the book collection set the environment variables JAVA_HOME, CFX_HOME and PATH correctly, then type:

        mkdir -p BookSOAPClient/src
        wsdl2java -d BookSOAPClient/src http://localhost:9000/myBooks?wsdl
       

    You might want to rename the package afterwards.

  2. Now you need a list of books. An easy way is to download this CSV file and to import it using the opencsv library and the following code:

        private List<Book> books = new LinkedList<Book>();
       
        public void importFromFile(String path) throws IOException {
            CSVReader reader = new CSVReader(new FileReader(path));
            String [] nextLine = reader.readNext(); // skip first line
            while ((nextLine = reader.readNext()) != null) {
                // Reject books with no or invalid ISBN
                if (nextLine[1].matches("[0-9]+")) {
                    Book b = new Book();
                    b.setAuthor(nextLine[0]);
                    b.setIsbn(Integer.parseInt(nextLine[1]));
                    b.setTitle(nextLine[2]);
                    books.add(b);
                }
            }
        }
       
  3. Implement a dummy client that will add a book from the CSV file and then request all books until all books have been added.

        private BookCollection collection = MyBooks.getMyBookCollectionPort();
        private int adds = 0;
        private int gets = 0;

        public void work() {
            Book current;
            for (Book current : books) {
                collection.addBook(current); adds++;
                Book[] serverBooks = collection.getAllBooks(); gets++;
                for (Book b : serverBooks) {
                    collection.getBook(b.getIsbn()); gets++;
                  }
            }
        }
       
  4. Run the main method, that will import the CSV data, start the dummy client and measure the execution time.

        public static void main(String[] args) throws IOException {
            Client client = new Client();
            client.importFromFile("etc/books.csv");
            long start = System.currentTimeMillis();
            client.work();
            long end = System.currentTimeMillis();
            System.out.println("Execution time = " + (end - start) + "ms");
        }
       
  5. The result of running the client is shown below:

    Indicator Value
    Add requests 78
    Get requests 3197
    Time 25742ms

Caching

SOAP always uses HTTP POST requests for each message, therefore its neccessary for the caching web service to read the whole HTTP request including its body, to open the provided SOAP envelope and to unmarshall the XML that describes the message. In the end responding with a cached response is not different from normal responding. If you use Apache Axis or ASP.NET then you can enable caching inside your service class but if you want to use a caching proxy the only SOAP webservice cache I found was Ventus Proxy.

REST

Implementing the Server

  1. Add marshall and unmarshall methods to the Book class. For this example I use JSON with the Java library from json.org.

        public static Book fromJSON(JSONObject object) throws JSONException {
            Book result = new Book();
            result.setIsbn(object.getInt("isbn"));
            result.setAuthor(object.getString("author"));
            result.setTitle(object.getString("title"));
            return result;
        }
       
        public JSONObject toJSON() throws JSONException {
            JSONObject result = new JSONObject();
            result.put("isbn", getIsbn());
            result.put("author", getAuthor());
            result.put("title", getTitle());
            return result;
        }
       
  2. The next step is to create a the actual Server with the Java library Restlet.

    To do this, you have to create two different resources. One for the book collection and one for a single book. The resource for the book collection supports adding books and getting all books.

        public class BookCollectionResource extends ServerResource {
           
            // the underlying book collection
            protected static BookCollection books = new MyBookCollection();
           
            @Post
            public void add(Representation entity) throws Exception {
                JsonRepresentation rep = new JsonRepresentation(entity);
                Book book = Book.fromJSON(rep.getJsonObject());
                books.addBook(book);
                setStatus(Status.SUCCESS_CREATED);
            }
           
            @Get
            public JsonRepresentation getAll(Representation entity) throws Exception {
                JSONArray jsonArray = new JSONArray();
                for (Book book : books.getAllBooks()) {
                    jsonArray.put(book.toJSON());
                }
                setStatus(Status.SUCCESS_OK);
                return new JsonRepresentation(jsonArray);
            }
        }
       
  3. The resource for a single book uses the isbn parameter of the request to find the book and supports get, update and delete.

        public class BookResource extends ServerResource {
            // the underlying book object
            private Book book;
           
            @Override
            protected void doInit() throws ResourceException {
                super.doInit();
                setDimensions(new HashSet<Dimension>(Arrays.asList(Dimension.MEDIA_TYPE)));
                int isbn = Integer.parseInt(((String) getRequest().getAttributes().get("isbn")));
                book = BookCollectionResource.books.getBook(isbn);
            }
           
            @Get
            public JsonRepresentation getIt(Representation entity) throws Exception {
                if (book == null) {
                    setStatus(Status.CLIENT_ERROR_NOT_FOUND);
                    return new JsonRepresentation(new JSONObject());
                } else {
                    setStatus(Status.SUCCESS_OK);
                    JsonRepresentation rep = new JsonRepresentation(book.toJSON());
                    Calendar now = Calendar.getInstance();
                    now.add(Calendar.MINUTE, 30);
                    rep.setExpirationDate(now.getTime())); // assume that a book won't change very often
                    return rep;
                }
            }
           
            @Put
            public void storeBook(Representation entity) throws Exception {
                if (book == null) {
                    setStatus(Status.CLIENT_ERROR_NOT_FOUND);
                    return;
                }
                JsonRepresentation rep = new JsonRepresentation(entity);
                Book updated = Book.fromJSON(rep.getJsonObject());
                book.setTitle(updated.getTitle());
                book.setAuthor(updated.getAuthor());
            }
           
            @Delete
            public void deleteBook() {
                if (book == null) {
                    setStatus(Status.CLIENT_ERROR_NOT_FOUND);
                } else {
                    BookCollectionResource.books.deleteBook(book.getIsbn());
                    setStatus(Status.SUCCESS_NO_CONTENT);
                }
            }
        }
       
  4. Finally the code to start the server is needed.

        public static void main(String[] args) throws Exception {
            Component component = new Component();
            component.getServers().add(Protocol.HTTP, 9001);
            component.getDefaultHost().attach("/books", BookCollectionResource.class);
            component.getDefaultHost().attach("/books/{isbn}", BookResource.class);
            component.getLogService().setEnabled(false);
            component.start();
        }
       
  5. You can use your browser to access the books by navigating to http://localhost:9001/books.

Implementing the Client

  1. You need to create a new Java Project and reference the Project with the Book and the BookCollection class, as well as Restlet and JSON jar files.

  2. Create a class that implements the BookCollection interface and that uses the Restlet library and the methods toJSON and fromJSON of the Book class to create HTTP requests.

        public class BookCollectionProxy implements BookCollection {
           
            private static final String SERVICE_URL = "http://localhost:9001/books";
           
            ClientResource books = new ClientResource(SERVICE_URL);
           
            @Override
            public void addBook(Book book) {
                try {
                    Representation rep = new JsonRepresentation(book.toJSON());
                    books.post(rep);
                } catch (JSONException e) {
                    e.printStackTrace();
                }
            }
           
            @Override
            public void deleteBook(int isbn) {
                ClientResource res = new ClientResource(SERVICE_URL + "/" + isbn);
                res.delete();
            }
           
            @Override
            public void editBook(Book book) {
                try {
                    ClientResource res = new ClientResource(SERVICE_URL + "/" + book.getIsbn());
                    Representation rep = new JsonRepresentation(book.toJSON());
                    res.put(rep);
                } catch (JSONException e) {
                    e.printStackTrace();
                }
            }
           
            @Override
            public Book[] getAllBooks() {
                try {
                    JsonRepresentation json = new JsonRepresentation(books.get());
                    JSONArray array = json.getJsonArray();
                    Book[] result = new Book[array.length()];
                    for (int i = 0; i < array.length(); i++) {
                        result[i] = Book.fromJSON(array.getJSONObject(i));
                    }
                    return result;
                } catch (Exception e) {
                    printStackTrace();
                }
                return null;
            }
           
            @Override
            public Book getBook(int isbn) {
                try {
                    ClientResource res = new ClientResource(SERVICE_URL + "/" + isbn);
                    JsonRepresentation rep = new JsonRepresentation(res.get());
                    return Book.fromJSON(rep.getJsonObject());
                } catch (Exception e) {
                    e.printStackTrace();
                    return null;
                }
            }
        }
       
  3. You can now use the client with the dummy client as shown above with REST.

    Indicator Value
    Add requests 78
    Get requests 3159
    Time 23873ms

Caching

Because a REST request is nothing more then a normal HTTP request, every caching HTTP proxy can be used to improve the performance. In the example I use Apache HTTP Server.

  1. The first step is to install Apache. If you use linux then apache2 will propably be in a package repository, otherwise you might download the source from the official website or a distribution like XAMPP.

  2. Configure apache2 to act as an caching proxy.

    The following configuration create a virtual host, that will serve at http://localhost:9002/ and will forward request that are not cached to http://localhost:9001/.

        Listen 9002

        <VirtualHost *:9002>
            Servername localhost
           
            Location /
                Order allow,deny
                Allow from all
            /Location
           
            ProxyRequests Off
           
            ProxyPass / http://localhost:9001/
            ProxyPassReverse / http://localhost:9001/
           
            # Enable memory caching
            CacheEnable mem http://localhost:9001/books
           
            # Limit the size of the cache to 16 Megabyte
            MCacheSize 16384
        </VirtualHost>
       
  3. After changing the SERVICE_URL in the BookCollectionProxy to http://localhost:9002/books all request go through the Apache HTTP Server that will automatically cache some of the books and returning their JSON representation without even contacting the Restlet Server. This speeds up the processing as shown in the table below.

    Indicator Value
    Add requests 78
    Get requests 3159
    Time 16632ms

Results

In both examples Jetty (a small embeddable web server) was used and not a full blown application server like Tomcat that would probably be much faster. Comparing the three solutions SOAP, REST and REST-Caching the last one has by far the best performance. And that is exactly why I think REST is a very good solution for scalable web applications even though REST has disadvantages because it is very restricted and therefore sometimes hard to use for a general purpose interface.


Performance comparison between the presented solutions

"Service Calling made easy" by Geek and Poke
© 2010 Christopher Schuster. Drupal theme by Kiwi Themes.