Nicolas My's development blog: I once was lost but now I'm found(ed)

Let's implement hash tables in Java

2013-10-24 23:51:23

OK this is reinventing wheel but it's kind of fun so let's do it.

Just like in java.util.HashMap, hash tables are a data structure used to store elements using keys.

I'll do it the plain an simple way :
* Object and Key arrays are used to store elements
* Object hashcode() methods is used to get the key hash value.
* Open-addressing is used to resolve collisions : means that if an element with the same hash is found, we proceed to the next available space in our array.
* Putting two object with the same key ends in the same space.
* If table is full then we create an array twice the size and put all our stored objects back into it.

Little trick : arrays size is a power of 2 so that modulo can be replaced by a bitmask operation.

 
public class HashMap <V> {
    Object[] keys;
    Object[] values;
 
    int initialSize = 4;
    int currentSize = initialSize;
 
    public HashMap() {
        values = new Object[currentSize];
        keys = new Object[currentSize];
    }
 
    public void put(Object key,V value) {
        int pos = findStorageIndice(key.hashCode());    
        if (pos == -1) {
            resizeBucket(currentSize<<1);
            put(key,value);
        } else {
            keys[pos] = key;
            values[pos] = value;
        }
    }
 
    @SuppressWarnings("unchecked")
    private void resizeBucket(int newSize) {
        Object oldkeys[] = keys;
        Object oldvalues[] = values;
        currentSize = newSize;
        values = new Object[currentSize];
        keys = new Object[currentSize];
 
        for (int i = 0; i < oldkeys.length;i++) {
            put(oldkeys[i],(V)oldvalues[i]);
        }
    }
 
 
    private int findStorageIndice(Object key) {
        int startPos = key.hashCode() & (currentSize-1);
        if ((keys[startPos] == null)||(keys[startPos].equals(key))) {
            return startPos;
        }
 
        for (int i = (startPos+1) & (currentSize-1);i != startPos;i=(i+1) & (currentSize-1)) {
            if ((keys[i] == null)||(keys[i].equals(key))) {
                return i;
            }
        }
        return -1;
    }
 
    private int getIndice(Object key) {
        int indice = key.hashCode() & (currentSize-1);
        if (key.equals(keys[indice])) {
            return indice;
        }
        for(int i = (indice+1) & (currentSize-1);i != indice;i= (i+1) & (currentSize-1)) {
            if (key.equals(keys[i])) {
                return i;
            }
        }
        return -1;
    }
 
    @SuppressWarnings("unchecked")
    public V get(Object key) {
        int indice = getIndice(key);
        if (indice == -1) {
            return null;
        } else {
            return (V)values[indice];
        }
    }
}

We start with an initial bucket size of 4. The code is simple and fairly compact, so I did make a comparison with the reference Java implementation.

 
{
    public static int getRandInt() {
        int lower = 0;
        int higher = 100000000;
 
        int random = (int)(Math.random() * (higher-lower)) + lower;
        return random;
    }
 
    public static void main(String argv[]) {        
        long millis  = System.currentTimeMillis();
        System.out.println("Launching custom HashMap insertion");        
        HashMap<String> map = new HashMap<String>();
        for (int i = 0;i < 1000000;i++) {
            Integer iobj = new Integer(getRandInt());
            map.put(iobj, "Value="+iobj);
            map.get(iobj);
        }
        long currentmillis  = System.currentTimeMillis();
        System.out.println("Elapsed : " +(currentmillis - millis) + "ms");
 
 
        millis  = System.currentTimeMillis();
        System.out.println("Launching reference HashMap insertion");        
        java.util.HashMap<Integer,String> map2 = new java.util.HashMap<Integer,String>();
        for (int i = 0;i < 1000000;i++) {
            Integer iobj = new Integer(getRandInt());
            map2.put(iobj, "Value="+iobj);
            map2.get(iobj);
        }
        currentmillis  = System.currentTimeMillis();
        System.out.println("Elapsed : " +(currentmillis - millis) + "ms");                
    }
}

I'll evaluate performance comparing by storing and retrieving 1 million strings with integer as keys .

 
Launching custom HashMap insertion
Elapsed : 5871ms
Launching reference HashMap insertion
Elapsed : 507ms

Huge performance gap, at least it shows how optimized the java framework implementation is.
One important optimization is not to let the arrays getting filled as collisions are expensive and tends to occur massively as the backing array is getting crowded, therefore we will enlarge our bucket array when filled to a certain percentage.

=> 75% (as in java implementation) gives us this result :

 
Launching custom HashMap insertion
Elapsed : 1127ms

=> 50%

 
Launching custom HashMap insertion
Elapsed : 970ms

=> 30%

 
Launching custom HashMap insertion
Elapsed : 685ms

First element of a partition group in Oracle.

2013-10-22 11:14:36

Always useful :

 
select *  from(
  SELECT dev.RREC_COD,DEV.TYPE_EVT_CODE ,
    ROW_NUMBER() OVER (PARTITION BY dev.RREC_COD ORDER BY dev.DATE_EVT) as rnum  
  from ZZ dev
  where 
    EXTRACT(YEAR FROM dev.DATE_EVT) = 2012
  ) t
where t.rnum = 1
  and t.RREC_COD = 1234

Open-source magic

2013-10-02 00:42:51

I've just seen this message in kivy mailling list.

 
On Sat, 13 Oct 2012, Waffle wrote: 
 
# Is there a simple camera scrolling example for kivy that is available? I am
 trying to make a side scroller however I cannot find a simple example to follow. 
I # looked into the kobr landing zone game, but it was too hard to understand and 
I spent a few hours trying to implement its mechanisms into my game (no luck)

I'm happy that someone took interest in my 1 month project LandingZone. https://github.com/kobr4/LandingZone (Yay the begining of fame !)
Looking back at it, yes the code is quite obfuscated, with heavy use of object-oriented programming style : inheritance, overload, method override.
Eventually, as often with object-oriented, objects are not reusable and code logic not so easy to grasp.

Another point is that without explicit type declaration, variables names has to be self explanatory, I understand that function which signature is like : "def focus(self,p1,p2)" can be hard to use.

Let's request Yahoo YQL API in Java

2013-09-29 23:58:17

Yahoo YQL API( http://developer.yahoo.com/yql/ ) offers free access to interesting data like weather forecast or finance quotes.
Here is how to get started with requesting data and parsing the JSON result set using Jackson.

The request endpoint is : http://query.yahooapis.com/v1/public/yql

 
    private static String baseUrl = "http://query.yahooapis.com/v1/public/yql?q=";
 
    private static String buildRequest(String request) {
 
        try {
            return baseUrl + URLEncoder.encode(request, "UTF-8")+"&format=json";
        } catch (UnsupportedEncodingException e) {
            return baseUrl + URLEncoder.encode(request) +"&format=json";
        }
    }

note the "format=json" parameter, otherwise, Yahoo API default to xml.

We want to retrieve data about San Francisco :

 
String request = 
  buildRequest("select * from geo.places where text=\"san francisco, ca\"");

Note how we use some pseudo sql language, hence it's called YQL.

 
        ObjectMapper sMapper = new ObjectMapper();
        sMapper.configure(org.codehaus.jackson.map.
DeserializationConfig.Feature.FAIL_ON_UNKNOWN_PROPERTIES, false);  
 
        String request = buildRequest("select * from geo.places where
 text=\"san francisco, ca\"");
        InputStreamReader is = null;
        Request yqlrequest = null;
        try {
            is = fetchRessource(request);
            yqlrequest = sMapper.readValue(is, Request.class);
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        } finally {
            if (is != null)
                try {
                    is.close();
                } catch (IOException e) {
                    // TODO Auto-generated catch block
                    e.printStackTrace();
                }
        }
 
        if ((yqlrequest != null)&&(yqlrequest.query != null)) {
            System.out.println("PLACE :"+yqlrequest.query.result.placeList.size());
        } else {
            System.out.println("Objet null !");
        }

Jackson, among other things, can do object mapping. Therefore, we have to define our receiving object. Here is the response from the API:
http://query.yahooapis.com/v1/public/yql?q=select+*+from+geo.places+where+text%3D%22san+francisco%2C+ca%22&format=json+

Our "Request" object hold a "Query" object that holds a "Results" object that holds "placeList"

The we can create our object to be mapped :

 
public class Request {
 
    @JsonProperty("query") 
    public Query query;
}

Mapping is define through annotations, it's also possible to map on setters when using proper encapsulation.

 
public class Query {
    @JsonProperty("results") 
    Result result;
}

 
public class Result {
    @JsonProperty("place")
    public List<Place> placeList;
}

 
public class Place {
    @JsonProperty("name") 
    public String name;
}

Try without catch in Java

2013-09-04 18:35:44

There's something to learn every day.

It's possible to make a try statement without catching anything, for instance if something has to be done in the finally but we still need to propagate the exception to the calling function :

 
try {
  preparedStatement.executeQuery();
}
catch(SQLException e){
  throw e;
} 
finally {
  prepardedStatement.close();
}

Can be replaced by:

 
try {
  preparedStatement.executeQuery();
}
finally {
  prepardeStatement.close();
}

Java templating engine : Freemarker

2013-08-30 00:08:41

Often comes the need to create dynamic strings, that is strings that are computed depending of program values.
One of the most common example is dynamic SQL requests : joins are made or not depending on whether a form field has been filled or not.

 
String A = "...";
String B = "...";
A + a?B:"";

As code base grows, this kind of code is hard to read and maintain.

Template engines comes in handy as it's much more elegant solution, I used Freemarker to create an html view of the content of directory in java :

 
    public static String getDirectory(String path) {        
        File directory = new File(path);
 
        Map<String, Object> data = new HashMap<String, Object>();
        data.put("path", path);
        List<String> files = new ArrayList<String>();
 
        if (!directory.isDirectory())
        return null;
 
        if (directory != null) {
            String [] fileList; 
 
            int i; 
            fileList = directory.list(); 
            if (fileList != null) {
                for(i=0 ;i<fileList.length;i++) { 
                    files.add(fileList[i]);
                }
            }
        }        
 
        data.put("files", files);
 
        Configuration cfg = new Configuration();
 
        Template template;
        try {
            template = cfg.getTemplate("src/directoryhtml.ftl");
 
            StringWriter out = new StringWriter();
 
            template.process(data, out);
 
            return out.toString();    
 
        } catch (IOException e) {
            e.printStackTrace();
        } catch (TemplateException e) {
            e.printStackTrace();
        }
 
        return null;

The file directoryhtml.ftl is a template that uses our passed "files" and "path" data to create the final file.

 
<html>
  <body>
    <#list files as file>
      <a href="${path}${file}">${file}</a><br>
    </#list>
  </body>
</html>

Getting started with Netty 4.0

2013-08-27 22:53:00

Netty 4.0 is an "asynchronous event-driven network application framework". Pretty much like NodeJS, it allows development of low-level networking applications in an event driven manner.

I've found available tutorials to be confusing, so let's start by the plain and simple "EchoServer".

First the class that will start our service :

public class EchoServer {
       private final int port;
 
        public EchoServer(int port) {
            this.port = port;
        }
 
        public void run() {
 
            EventLoopGroup bossGroup = new NioEventLoopGroup(); // (1)
            EventLoopGroup workerGroup = new NioEventLoopGroup();            
 
            // Configure the server.
            ServerBootstrap bootstrap = new ServerBootstrap();
 
            bootstrap.group(bossGroup, workerGroup)
             .channel(NioServerSocketChannel.class) // (3)
             .childHandler(new ChannelInitializer<SocketChannel>() { // (4)
                 @Override
                 public void initChannel(SocketChannel ch) throws Exception {
                     ch.pipeline().addLast(new EchoServerHandler());
                 }
             })
             .option(ChannelOption.SO_BACKLOG, 128)          // (5)
             .childOption(ChannelOption.SO_KEEPALIVE, true); // (6)
 
            // Bind and start to accept incoming connections.
            bootstrap.bind(new InetSocketAddress(port));
        }
 
        public static void main(String[] args) throws Exception {
            int port;
            if (args.length > 0) {
                port = Integer.parseInt(args[0]);
            } else {
                port = 8080;
            }
            new EchoServer(port).run();
        }
}

This will start our server and wait forever on the bound port for incoming connexions.

Then our server handler :

public class EchoServerHandler extends ChannelInboundHandlerAdapter {
    @Override
    public void channelActive(ChannelHandlerContext ctx) throws Exception {
        String s = "HELLO.\n";
        final ByteBuf bbHello = ctx.alloc().buffer(s.getBytes().length);
        bbHello.writeBytes(s.getBytes());
        ctx.write(bbHello);
        ctx.flush();
    }
 
    @Override
    public void channelRead(ChannelHandlerContext ctx, Object msg) { // (2)
        ctx.write(msg);
        ctx.flush();     
    }
}

On channel activation, we send "HELLO." to the incoming client, then when something received from the sender, we write it directly back.

Our new server can be tested by using the telnet command :
telnet localhost 8080

Quicksort : my partition function

2013-08-26 05:57:38

Been trying to implement the quicksort algorithm, to sum up, a sorting algorithm that consists in spliting the sorted array recursively by placing sorted values left or right of an arbitrary choosen pivot, whether the value is above or below pivot value. Then starting again with the left and right array that we obtained.

The heart of this algorithm really is the partition function.

I've come up with this version :

 
    private static int partition(int tab[], int i, int j) {    
        int p = tab[i];
        boolean right = true;
        while (i < j) {
 
            if (right) {
 
                if (p > tab[j]) {
                    int tmp = tab[j];
                    tab[j] = tab[i];
                    tab[i] = tmp;
                    right = false;
                    i++;
                } else {            
                    j--;
                }
            }
 
            if (!right) {
 
                if (p <= tab[i]) {
                    int tmp = tab[j];
                    tab[j] = tab[i];
                    tab[i] = tmp;
                    right = true;
                    j--;
                } else {            
                    i++;
                }
            }            
        }    
 
        return i;
 
    }

We say that the pivot is the first element of the array, then start at the right :if element is below pivot then exchange elements and then look right (else it means that the element is at the right place so we continue to the next right element).
Basically, the pivot element is travelling the left to right partition depending on the previous swaped element, until it reaches the center.

Later on, I've implemented the partition algorithm that appear in wikipedia :

 
    private static int partition2(int tab[], int i, int j) {
        int pivot_value = tab[i];
        int store_index = i;
        swap(tab,i,j);
        for (int id = i; id < j;id++) {
            if (tab[id] < pivot_value){
                swap(tab,id,store_index);
                store_index++;
            }
        }
        swap(tab,store_index,j);
        return store_index;
    }

This algorithm is fully linear : store_index is the index of the partition.
When an element below the pivot is found : this element is swapped with the element in front of the store_index (which is part of the right partition and therefore, should be above pivot). The store_index is increased by one, as it gained a new element.

The calling recursive function:

 
    public static void quickSort(int tab[], int i, int j) {
        if (i < j) {
            int q = partition2 (tab,i,j);    
            quickSort(tab,i,q);    
            quickSort(tab,q+1,j);
        }
    }

partition2 is proven to be twice as fast as partition, advantage is only significant on 100000+ element to sort, I guess mainly because linear access to memory is faster.

Note that complexity for the quicksort algorithm is expected to be : O(nlog n), but worst case scenario is O(n^2).
In an interview : one could mention the merge-sort algorithm which ensure O(n log n) complexity.
Quicksort advantage is that it sort elements in place.

Fast modulo on power of 2

2013-08-23 00:43:51

A nice optimisation when calculating modulor for a power of 2 :

 
x % 512 = x & (512 - 1)

That is to say that the remainder is given by the lower-order bits of x.

Some words about the Play framework

2013-08-20 00:06:25

http://www.playframework.com/

I've been toying around with the Play framework for a few hours and I must say that it's an interesting piece of work.

At work, I'm still using "old school" java web framework : tomcat/struts/spring/jsp and in this regard, Play is quite refreshing. In fact it feels so different, that there's almost nothing in common except for the Java language usage.

Play is lightweight and elegant :

* Creating a new project from scratch is only a matter of writing a single command-line.

* Hot swap : everything from Java code to configuration files can modified without need to restart the webserver.

* Ditch the xml : xml was a nice idea but ended up leading to endless configuration files filled with unreadable tags.
Plain text is the way to go !

* Dicth the JSP : jsp is not so bad at what it does but like the xml file format, mixing html and xml tag is not particulary easy to read, moreove it's quite limited semanticaly (no procedure / limited branching) => Scala as a full fledged scripting language come's here as an elegant solution.

With the good, comes the bad :

* Far from the standard, meaning :
Support if it's broken ?
Having to completely rethink web developpement in Java.
Can't it play well with other Java eco-system libraries ?

* No persistence API or service architecture
At least, Play leaves us free to use whatever fits our needs.

See 10 older news | Back to the top