java - How to sort an ArrayList containing more than 1000 different Strings on the basis of similarity to another given String
I have an ArrayList that contains 1000 strings. I want to sort the list on the basis of similarity to a given string from outside the list. Strings that are close to that string should come at the top.
For example, I have the string "beauty , beast".
My ArrayList contains strings like:
redwall
beauty , beast 3
bluewall
beautyqueen i
beast of rome ii
beauty , beast 1
beast beauty
bluewall 2
beautyqueen ii
beast of rome i
beauty , beast 2
...
So after sorting, the ArrayList should look like:
beauty , beast 1
beauty , beast 2
beauty , beast 3
beast beauty
beautyqueen i
beautyqueen ii
beast of rome i
beast of rome ii
bluewall
bluewall 2
redwall
Something like this. I don't know how the order should go after "beauty , beast 3"; it should pick the strings that have the same string at the beginning.
I am looking for an algorithm that can help me implement this task in Java.
I have heard of using Levenshtein distance, but I have no idea how it can be used for this task.
Any pointers would be a lot of help.
I have created a custom comparator as per your need, and here's the code.
- s is the search string; matching or closely matching strings should appear first.
- I have created a Set<String> matches to store the tokens (words) of the search string.
- I have created a Comparator c that has a method getScore(String), which gives a score based on the number of search-string tokens found in a given string of the list.
- If getScore returns 0 for both strings, or if both strings have the same number of matches, I sort them in natural ordering. Otherwise I promote the string that has more matches by returning a negative value.
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SimilaritySort {
    public static void main(String[] args) {
        List<String> l = new ArrayList<String>();
        l.add("redwall");
        l.add("beauty , beast 3");
        l.add("bluewall");
        l.add("beautyqueen i");
        l.add("beast of rome ii");
        l.add("beauty , beast 1");
        l.add("beast beauty");
        l.add("bluewall 2");
        l.add("beautyqueen ii");
        l.add("beast of rome i");
        l.add("beauty , beast 2");

        String s = "beauty , beast"; // search string

        // Lower-cased tokens of the search string. Splitting on whitespace
        // also yields "," as a token for this particular search string.
        final Set<String> matches = new HashSet<String>();
        for (String token : s.split("\\s")) {
            matches.add(token.toLowerCase());
        }

        Comparator<String> c = new Comparator<String>() {
            @Override
            public int compare(String o1, String o2) {
                int scoreDiff = getScore(o1) - getScore(o2);
                if (scoreDiff == 0) {
                    // Same number of matches (including none): natural ordering.
                    return o1.compareTo(o2);
                }
                // The string with more matches sorts first.
                return -scoreDiff;
            }

            // Number of search tokens contained in s, ignoring case.
            private int getScore(String s) {
                String lower = s.toLowerCase();
                int score = 0;
                for (String match : matches) {
                    if (lower.contains(match)) {
                        score++;
                    }
                }
                return score;
            }
        };

        Collections.sort(l, c);
        for (String ss : l) {
            System.out.println(ss);
        }
    }
}
And here's the output:
beauty , beast 1
beauty , beast 2
beauty , beast 3
beast beauty
beast of rome i
beast of rome ii
beautyqueen i
beautyqueen ii
bluewall
bluewall 2
redwall
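Since the question asks how Levenshtein distance could be used here, the following is a minimal sketch of one possible approach, not part of the answer above. It sorts by the edit distance between each lower-cased list entry and the lower-cased search string, breaking ties by natural ordering. The class name LevenshteinSort and the helper names levenshtein and sortBySimilarity are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class LevenshteinSort {

    // Edit distance (insertions, deletions, substitutions) computed with
    // the usual dynamic-programming recurrence, keeping only two rows.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Hypothetical helper: returns a sorted copy, smallest distance first,
    // ties broken by natural ordering. The input list is not modified.
    static List<String> sortBySimilarity(List<String> items, String query) {
        final String q = query.toLowerCase();
        List<String> sorted = new ArrayList<>(items);
        sorted.sort(Comparator
                .comparingInt((String s) -> levenshtein(s.toLowerCase(), q))
                .thenComparing(Comparator.<String>naturalOrder()));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> l = Arrays.asList(
                "redwall", "beauty , beast 3", "bluewall", "beautyqueen i",
                "beast of rome ii", "beauty , beast 1", "beast beauty",
                "bluewall 2", "beautyqueen ii", "beast of rome i",
                "beauty , beast 2");
        for (String s : sortBySimilarity(l, "beauty , beast")) {
            System.out.println(s);
        }
    }
}
```

Note that this ranks by whole-string edit distance, so beyond the near-exact matches the order can differ from the token-count comparator: a short unrelated string may end up closer than a longer one that shares a word with the search string. The comparator also recomputes the distance on every comparison, which should be acceptable for about 1000 strings but could be cached for larger lists.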