Design Search Autocomplete System
Design a search autocomplete system for a given list of sentences. The system should return the top 3 most frequent sentences that start with a given prefix.
Constraints:
- The input sentences will be a list of strings.
- The input prefix will be a string.
- The output should be a list of the top 3 most frequent sentences that start with the given prefix.
Examples:
Input: ["The quick brown fox jumps over the lazy dog", "The dog runs quickly", "The quick dog runs"], "The"
Output: ["The quick brown fox jumps over the lazy dog", "The dog runs quickly", "The quick dog runs"]
Explanation: The top 3 most frequent sentences that start with the prefix "The" are returned.
Solutions
Trie with Hash Table
The solution uses a Trie data structure to store the sentences and a hash table to store the frequency of each sentence. The search function uses a depth-first search to find all sentences that start with the given prefix and then sorts them based on their frequency.
class AutocompleteSystem {
private TrieNode root;
private HashMap<String, Integer> frequency;
public AutocompleteSystem(String[] sentences) {
root = new TrieNode();
frequency = new HashMap<>();
for (String sentence : sentences) {
TrieNode node = root;
for (char c : sentence.toCharArray()) {
if (!node.children.containsKey(c)) node.children.put(c, new TrieNode());
node = node.children.get(c);
}
node.end = true;
frequency.put(sentence, frequency.getOrDefault(sentence, 0) + 1);
}
}
public List<String> search(String prefix) {
TrieNode node = root;
for (char c : prefix.toCharArray()) {
if (!node.children.containsKey(c)) return new ArrayList<>();
node = node.children.get(c);
}
List<String> sentences = new ArrayList<>();
dfs(node, prefix, sentences);
sentences.sort((a, b) -> frequency.get(b).compareTo(frequency.get(a)));
return sentences.subList(0, Math.min(3, sentences.size()));
}
private void dfs(TrieNode node, String sentence, List<String> sentences) {
if (node.end) sentences.add(sentence);
for (char c : node.children.keySet()) {
dfs(node.children.get(c), sentence + c, sentences);
}
}
}
Follow-up:
How would you optimize the solution to handle a large number of sentences?