Java Program To Make Frequency Count Of Words


Question: Write a Java program to make frequency count of words in a given text. Solution: The program takes a text and a word as input and displays the frequency of.


You'll have to look at the source files to see how/why you are parsing that blank space; your split method parameter should deal with it automatically.

How can I make my code neglect certain words that are listed in a file, say stop.txt?

You can read stop.txt into an array (just like reading the words in your existing code). You can then write a tiny method that tells you whether a given word is in that array, and use that method to decide which words to neglect. There are classes in the Java API that will make this easier, if you want to learn/use them. E.g. read the words in stop.txt into an ArrayList, then use its contains method to see if it contains a given word.
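As a rough illustration of the ArrayList/contains suggestion, here is a minimal sketch. The class name StopWordFilter, the helper names toWordList and countWords, and the sample lines are all made up for this example; in a real program the lines would be read from stop.txt and from the input file. It stores one lower-cased word per list entry so that contains can be used directly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class StopWordFilter {

    // Splits each line on whitespace and adds the words to the list,
    // one lower-cased word per entry.
    static List<String> toWordList(List<String> lines) {
        List<String> words = new ArrayList<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    words.add(word.toLowerCase());
                }
            }
        }
        return words;
    }

    // Counts every word that does not appear in the stop list.
    static Map<String, Integer> countWords(List<String> words, List<String> stopList) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            if (stopList.contains(word)) {
                continue; // neglect stop words
            }
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Assumption: hard-coded sample data stands in for the contents of
        // stop.txt and the text file being analysed.
        List<String> stopList = toWordList(Arrays.asList("the a", "of"));
        List<String> words = toWordList(
                Arrays.asList("The cat sat on the mat", "a cat of the house"));
        System.out.println(countWords(words, stopList));
    }
}
```

For a long stop list, a HashSet would make each contains lookup faster, but for a short list an ArrayList is perfectly adequate.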

A bit of a digression, but I've been practicing with the new features in Java 8 (due March), and I have to share this with anyone who's interested. The following code creates a map of counts for all the unique words (case insensitive) in all the .txt files in a specified folder.

    Path dirPath = Paths.get("c://testdata");
    Map<String, Long> counts = Files.list(dirPath)
        // .parallel()
        .filter(path -> path.toString().toLowerCase().endsWith(".txt"))
        .flatMap(path -> bufferedReaderFromPath(path).lines())
        .flatMap(line -> Stream.of(line.split("\\s+")))
        .filter(word -> word.length() > 0)
        .map(word -> word.toLowerCase())
        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

More amazingly: just uncomment the parallel() call and it will automatically split and run this in a suitable number of parallel threads! Java 8 is really the biggest thing since 1.5, maybe even bigger.

PS: bufferedReaderFromPath(path) is just a cover for new BufferedReader(new FileReader(path.toFile())), but because the lambdas used here can't throw checked exceptions, it needed a wrapper method to deal with any FileNotFoundException.
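The post does not show the wrapper method, so the version below is only a guess at what it might look like: it rethrows the checked FileNotFoundException as an UncheckedIOException. The class name FolderWordCount, the method names countWords and countWordsInFolder, and the temporary test folder are assumptions added to make the sketch self-contained.

```java
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FolderWordCount {

    // Hypothetical wrapper: turns the checked FileNotFoundException into an
    // unchecked exception so the call can be used inside a lambda.
    static BufferedReader bufferedReaderFromPath(Path path) {
        try {
            return new BufferedReader(new FileReader(path.toFile()));
        } catch (FileNotFoundException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Case-insensitive word counts for a stream of text lines.
    static Map<String, Long> countWords(Stream<String> lines) {
        return lines
            .flatMap(line -> Stream.of(line.split("\\s+")))
            .filter(word -> word.length() > 0)
            .map(String::toLowerCase)
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    // Counts the words in every .txt file directly inside dirPath.
    static Map<String, Long> countWordsInFolder(Path dirPath) throws IOException {
        // Files.list holds an open directory handle, so close the stream when done.
        try (Stream<Path> files = Files.list(dirPath)) {
            return countWords(files
                .filter(path -> path.toString().toLowerCase().endsWith(".txt"))
                .flatMap(path -> bufferedReaderFromPath(path).lines()));
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a temporary folder with two small files stands in for
        // the test data folder used in the post.
        Path dir = Files.createTempDirectory("wordcount");
        Files.write(dir.resolve("a.txt"), Arrays.asList("Hello world", "hello Java"));
        Files.write(dir.resolve("notes.log"), Arrays.asList("this file is skipped"));
        System.out.println(countWordsInFolder(dir));
    }
}
```

One caveat with this sketch: the BufferedReader instances are never explicitly closed. Files.lines(path) is an alternative way to get a stream of lines, though it also throws a checked IOException and would need the same kind of wrapper.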

You have made this too hard by building a List of arrays. It will be much simpler if you just have a List containing one word per entry. So at line 8, instead of adding the whole array to the list, use a small loop to add all the words in the array to the list one at a time. Now that you have a list of words, you can simply use stopList.contains(someWord) to see if someWord is in the list.

PS: line 7 does nothing. toString returns a string representation of the array (its type etc., not its contents), but you don't do anything with the returned value.

That code creates a new LinkedHashMap.

LinkedHashMap is interesting because it remembers the order in which its elements were added. The method starts by getting a List of all the entries (key/value pairs) from your Map, sorts them by value according to the new Comparator, then adds them to the LinkedHashMap, so the LinkedHashMap is now in the same order as the List. At this point you should have noticed that you have no need of the LinkedHashMap at all for this application. Everything you need is in the sorted List.

Printing the first 'n' entries from the list is trivial, but since you want the highest counts, not the lowest, you need to change the comparator. I recognised that code immediately, so there's a serious chance that your teacher will as well, which may not help your final grade!

If I were you, I would now write my own highly simplified version of that, which just does what's needed for this exercise. Get rid of all the generics and hard-code the types from your own HashMap. Fix the comparator to sort descending rather than ascending. Get rid of the LinkedHashMap and just use the List. That will also prove that you understood what you were doing and didn't just copy something blindly.
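One possible shape for such a simplified version is sketched below. It assumes the word counts live in a HashMap of String to Integer; the class name TopWords, the method name sortByCountDescending and the sample counts are invented for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TopWords {

    // Returns the map's entries as a List sorted by count, highest first.
    // The types are hard-coded (String/Integer) and no LinkedHashMap is used.
    static List<Map.Entry<String, Integer>> sortByCountDescending(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        Collections.sort(entries, new Comparator<Map.Entry<String, Integer>>() {
            @Override
            public int compare(Map.Entry<String, Integer> a, Map.Entry<String, Integer> b) {
                // Comparing b to a (instead of a to b) gives descending order.
                return b.getValue().compareTo(a.getValue());
            }
        });
        return entries;
    }

    public static void main(String[] args) {
        // Assumption: sample counts standing in for the real word-count map.
        Map<String, Integer> counts = new HashMap<>();
        counts.put("map", 1);
        counts.put("java", 5);
        counts.put("word", 3);

        int n = 2; // how many of the highest counts to print
        List<Map.Entry<String, Integer>> sorted = sortByCountDescending(counts);
        for (int i = 0; i < Math.min(n, sorted.size()); i++) {
            System.out.println(sorted.get(i).getKey() + " = " + sorted.get(i).getValue());
        }
        // prints:
        // java = 5
        // word = 3
    }
}
```

Words with equal counts come out in whatever order the HashMap happened to supply them; a secondary comparison on the key would make the output fully deterministic if that matters.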

Edited 4 years ago by JamesCherrill.

If this does not need to be super-fast, just create an array of integers, one integer for each letter (only alphabetic, so 2*26 integers? Or is any binary data possible?).

Go through the string one char at a time, get the index of the corresponding integer, and increment the value at that index. For example, if you only have alphabetic chars, you can put 'A' at index 0 and get the index for any char from 'A' to 'Z' by subtracting 'A' from it, which is a reasonably fast way to compute indices. There are various micro-optimizations to make this faster (if necessary).
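The array-of-counters idea could be sketched in Java roughly as follows. The layout (upper-case letters at indices 0 to 25, lower-case letters at 26 to 51, everything else ignored) is an assumption chosen to match the "2*26 integers" remark, and the class and method names are made up for the example.

```java
class LetterFrequency {

    // Counts ASCII letters only. Index 0..25 holds 'A'..'Z' and
    // index 26..51 holds 'a'..'z'; all other chars are ignored.
    static int[] countLetters(String text) {
        int[] counts = new int[2 * 26];
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c >= 'A' && c <= 'Z') {
                counts[c - 'A']++;          // 'A' maps to index 0
            } else if (c >= 'a' && c <= 'z') {
                counts[26 + (c - 'a')]++;   // 'a' maps to index 26
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = countLetters("Hello World");
        // Print only the letters that actually occurred.
        for (int i = 0; i < counts.length; i++) {
            if (counts[i] > 0) {
                char letter = (char) (i < 26 ? 'A' + i : 'a' + (i - 26));
                System.out.println(letter + ": " + counts[i]);
            }
        }
    }
}
```

If arbitrary byte values need to be counted instead, the same approach works with an array of 256 counters indexed directly by the byte value.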