Translate

Thursday, March 23, 2017

Java HashMap Search and Sort

A HashMap has plenty of uses, so let's see how you can find keys and values, load data from a CSV file into one, and use it for sorting.

Introduction

Having explored HashMap in several previous articles (here and here), let us now learn how to search and sort a HashMap by key as well as value.

Finding Keys and Values in HashMap

Finding a key in a HashMap is quite simple. The HashMap API provides the containsKey() method, which tells you whether the key exists.
Map<String,Integer> namefreq = new HashMap<>();
namefreq.put("Petra", 14);
namefreq.put("Mario", 11);
namefreq.put("Kasandra", 23);
namefreq.put("Charity", 18);
namefreq.put("Minerva", 5);
if ( namefreq.containsKey("Charity") ) {
}

Finding a value is also easy given the method containsValue().
if ( namefreq.containsValue(10) ) {
}

Pretty simple, right? Well, what if you need to find not a particular value but search for a general expression, such as names starting with say “A”. Gets a bit more involved.

Loading CSV Into HashMap

The following search and sort examples make use of a name frequency table, which is loaded from CSV using the following code. It streams lines from a CSV file, splits the line into fields, selects values for the “CA” state, and stores the value as a Map of (year => count). (The data is from US Census Bureau which publishes name frequency counts for each year.)
The code creates a multi-HashMap with the structure:
(name => ((year => count),
          (year => count)),
...

Pattern pattern = Pattern.compile(",");
String csvFile = "StateNames.csv";
try (BufferedReader in = new BufferedReader(new FileReader(csvFile));){
    Map<String,Map<Integer,Integer>> namefreq = in
    .lines()
    .skip(1)
    .map(x -> pattern.split(x))
    .filter(x -> x[4].equals("CA"))
    .collect(HashMap::new, (map, x) ->
         map.compute(x[1], (k, v) -> {
             if ( v == null )
                 v = new HashMap<Integer,Integer>();
             v.put(Integer.parseInt(x[2]),
                   Integer.parseInt(x[5]));
             return v;
             }),
         Map::putAll);
}

Here is a snippet of the sample data being loaded.
Id,Name,Year,Gender,State,Count
1,Mary,1910,F,AK,14
2,Annie,1910,F,AK,12
3,Anna,1910,F,AK,10
4,Margaret,1910,F,AK,8
5,Helen,1910,F,AK,7
6,Elsie,1910,F,AK,6
7,Lucy,1910,F,AK,6
8,Dorothy,1910,F,AK,5
9,Mary,1911,F,AK,12
10,Margaret,1911,F,AK,7

Search Key in HashMap

The following code searches for a key ending with “x” and prints the matches.
namefreq
    .entrySet()
    .stream()
    .filter(e -> e.getKey().endsWith("x"))
    .forEach(e -> {
        System.out.println(e.getKey() + " :");
        e.getValue().forEach((kk, vv) -> {
            System.out.println(" " + kk + " => " + vv);
        });
    });
Rex :
 1959 => 86
Margaux :
 2003 => 8
Maxx :
 2010 => 28

Search Value in HashMap

Let us now search for a value in the HashMap using the same method. Before we do that, let us total the names for each year and store it under the key 0 for each name. Something like this:
Elza :
  0 => 10
  1993 => 5
  2014 => 5
...

We compute the total value with the following code. It computes a total of the existing mappings of (year => count) and stores it under key 0.
namefreq
    .entrySet()
    .stream()
    .forEach(e -> {
        Map<Integer,Integer> v = e.getValue();
        int tot = v
        .entrySet()
        .stream()
        .reduce(0, (x, ee) -> x + ee.getValue(),
            (x, y) -> x+y);
        v.put(0, tot);
    });

Let us now search the HashMap for names occurring more than 1000 times.
namefreq
    .entrySet()
    .stream()
    .filter(e -> e.getValue().getOrDefault(0, 0) > 1000)
    .forEach(e ->
         System.out.println(e.getKey()+" : "+e.getValue().get(0)));
...
Solomon : 1967
Javier : 28472
Esther : 16974
Lenora : 1261
Sam : 6971
Lenore : 1261
Rowan : 1297
Lukas : 2888
...

The actual search is being performed by the following segment:
.filter(e -> e.getValue().getOrDefault(0, 0) > 1000)

Here, you can use an expression of arbitrary complexity to search for exactly what you need. In the following example, we search for key containing the string “mon” and total value more than 1000. The results are stored in another HashMap (for further processing).
Map<String,Map<Integer,Integer>> subset = namefreq
    .entrySet()
    .stream()
    .filter(e ->  e.getKey().contains("mon")&&e.getValue().getOrDefault(0, 0) > 1000)
    .collect(HashMap::new,
         (m, e) -> m.put(e.getKey(), e.getValue()),
         Map::putAll);
subset.forEach((k, v) -> System.out.println(k + " : " + v.get(0)));
Simone : 3114
Desmond : 2498
Ramona : 8139
Raymond : 60506
...

Sort HashMap by Key

To sort a HashMap by key and print the mappings, we can do something like this.
namefreq
    .entrySet()
    .stream()
    .sorted((x, y) -> x.getKey().compareTo(y.getKey()))
    .forEach(e -> {
        System.out.println(e.getKey() + " :");
        e.getValue().forEach((kk, vv) -> {
            System.out.println(" " + kk + " => " + vv);
        });
    });
Lylia :
 0 => 6
 2009 => 6
Lyliana :
 0 => 15
 2007 => 5
 1998 => 5
 1999 => 5
...

To store the result (sorted map), we use a LinkedHashMap (which preserves the insertion order) as follows.
Map<String,Map<Integer,Integer>> subset = namefreq
    .entrySet()
    .stream()
    .sorted((x, y) -> x.getKey().compareTo(y.getKey()))
    .collect(LinkedHashMap::new,
         (m, e) -> m.put(e.getKey(), e.getValue()),
         Map::putAll);
subset.forEach((k, v) -> System.out.println(k + " : " + v.get(0)));
Byanca : 90
Byanka : 79
Byran : 65
Byron : 7254
Cache : 10
Cadance : 30
Cade : 1874
...

Sort HashMap by Value

Sorting the HashMap by value is very similar to the above example. The following code sorts the name-count HashMap by total-count and prints the top 20 names.
namefreq
    .entrySet()
    .stream()
    .sorted((x, y) ->
        y.getValue().get(0).compareTo(x.getValue().get(0)))
    .limit(20)
    .forEach(x -> System.out.println(x.getKey() + " => " +
                     x.getValue().get(0)));
Michael => 422157
David => 364853
Robert => 347637
John => 310120
James => 274168
Daniel => 244229
Richard => 222633
Christopher => 215728
William => 209173
Anthony => 174064
...

Summary

Searching for a key in a HashMap involves applying a filtering pipeline to the entries. The same method is also applicable to search for the HashMap values. In fact, arbitrarily complex expressions can be used with such a pipeline for search. We also learned how to sort the HashMap by keys or values and possibly store the results in a LinkedHashMap.

See Also



Source: Java HashMap Search and Sort