Problem:
Design an algorithm to encode a list of strings to a string. The encoded string is then sent over the network and is decoded back to the original list of strings.
Machine 1 (sender) has the function:
string encode(vector<string> strs) {
    // ... your code
    return encoded_string;
}
Machine 2 (receiver) has the function:
vector<string> decode(string s) {
    // ... your code
    return strs;
}
So Machine 1 does:
string encoded_string = encode(strs);
and Machine 2 does:
vector<string> strs2 = decode(encoded_string);
strs2 in Machine 2 should be the same as strs in Machine 1.
Implement the encode and decode methods.
Note:
- The string may contain any possible characters out of 256 valid ascii characters. Your algorithm should be generalized enough to work on any possible characters.
- Do not use class member/global/static variables to store states. Your encode and decode algorithms should be stateless.
- Do not rely on any library method such as eval or serialize methods. You should implement your own encode/decode algorithm.
Analysis:
This problem is mostly a test of implementation technique. Once you know the trick behind it, the solution turns out to be short.

First idea: use a special character to separate the strings. That does not work. Whichever character you choose, it may also appear inside one of the strings.

Second idea: reserve a fixed number of characters in front of each string to record its length. But how many prefix characters are enough, and how do you tell where the length field of one string ends? That quickly becomes a headache.

The idea that works: combine a special character with the size information. Wrap each string like this (the braces only mark a placeholder, they are not written to the output):

original_string ---> size:{original_string}

so that the encoded string looks like:

encoded_string = size1:{original_string1}size2:{original_string2}size3:{original_string3}size4:{original_string4}

This stays unambiguous even when the strings themselves contain ":" or digits. The decoder searches for ":" only while it is positioned at the start of a size field, which holds digits only, so the first ":" it finds is always the real separator. It then skips exactly size characters without inspecting them.

For a single block, how do we extract original_string from the wrapped string?

Step 1: get the start index of the block. The initial start index is 0.
-------------------------------------------------------------------
    int next_start = 0;

Step 2: use ":" to find the length of the original string.
-------------------------------------------------------------------
    int split_index = s.indexOf(":", next_start);
    int len = Integer.valueOf(s.substring(next_start, split_index));

Step 3: use the position of ":" together with the length to extract the original string.
-------------------------------------------------------------------
    String item = s.substring(split_index+1, split_index+1+len);
    ret.add(item);

Step 4: update the start index for the next block.
-------------------------------------------------------------------
    next_start = split_index+1+len;
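To make the format concrete, here is a minimal sketch of the wrapping step on its own. The class name FramingSketch, the method name wrapAll, and the sample strings are assumptions chosen for illustration and do not come from the solution itself. The sample deliberately includes a string containing ":", an empty string, and a string that looks like a wrapped block.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class: the name is an assumption, not part of the original solution.
final class FramingSketch {
    // Wrap each string as "<length>:<string>" and concatenate the blocks.
    static String wrapAll(List<String> strs) {
        StringBuilder sb = new StringBuilder();
        for (String str : strs) {
            sb.append(str.length()).append(':').append(str);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "ab:c" contains the separator, "" is empty, "1:x" looks like a block itself.
        List<String> sample = Arrays.asList("ab:c", "", "1:x");
        System.out.println(wrapAll(sample)); // prints 4:ab:c0:3:1:x
    }
}
```

Reading the output from left to right: "4:" announces the four characters "ab:c", "0:" announces an empty string, and "3:" announces the three characters "1:x".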
Wrong Solution:
public class Codec {

    // Encodes a list of strings to a single string.
    public String encode(List<String> strs) {
        if (strs == null)
            throw new IllegalArgumentException("strs is null");
        StringBuffer buffer = new StringBuffer();
        for (String str : strs) {
            buffer.append(str.length());
            buffer.append(":");
            buffer.append(str);
        }
        return buffer.toString();
    }

    // Decodes a single string to a list of strings.
    public List<String> decode(String s) {
        List<String> ret = new ArrayList<String> ();
        int next_start = 0;
        int split_index = s.indexOf(":");
        int len = Integer.valueOf(s.substring(next_start, split_index));
        while (next_start < s.length()) {
            String item = s.substring(split_index+1, split_index+1+len);
            ret.add(item);
            next_start = split_index+1+len;
            split_index = s.indexOf(":", next_start);
            len = Integer.valueOf(s.substring(next_start, split_index));
        }
        return ret;
    }
}
Mistakes Analysis:
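Last executed input: []

My first implementation is more complicated than it needs to be. Every wrapped string requires the same work, so no part of that work should spill out of the common block (the loop body).

    int next_start = 0;
    int split_index = s.indexOf(":"); // what if the encoded string contains no strings? indexOf returns -1
    int len = Integer.valueOf(s.substring(next_start, split_index)); // substring(0, -1) throws
    while (next_start < s.length()) {
        String item = s.substring(split_index+1, split_index+1+len);
        ret.add(item);
        next_start = split_index+1+len;
        split_index = s.indexOf(":", next_start);
        len = Integer.valueOf(s.substring(next_start, split_index));
    }

Reading the first size before the loop creates a corner case: for the input [] the encoded string is empty, there is no ":" to find, and substring throws. The look-ahead at the bottom of the loop has the same flaw. It also runs after the last block has been consumed, when no ":" is left.

The condition "while (next_start < s.length())" is already the right check for all of these cases, provided that all the per-block work happens inside the loop. It skips the loop for an empty input and stops right after the last block.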
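The corner case is easy to reproduce in isolation. The sketch below mirrors the two statements that the wrong solution executes before its loop. The class name EmptyInputSketch and the helper probe are hypothetical names used only for this demonstration.

```java
// Hypothetical demo class: the name is an assumption, not part of the original solution.
final class EmptyInputSketch {
    // Runs the same look-up the wrong solution performs before entering its loop
    // and reports what happened instead of letting the exception escape.
    static String probe(String s) {
        int splitIndex = s.indexOf(":");
        try {
            int len = Integer.parseInt(s.substring(0, splitIndex));
            return "len=" + len;
        } catch (StringIndexOutOfBoundsException e) {
            return "splitIndex=" + splitIndex + ", substring throws";
        }
    }

    public static void main(String[] args) {
        // Encoding an empty list yields the empty string.
        System.out.println(probe(""));      // prints splitIndex=-1, substring throws
        System.out.println(probe("3:abc")); // prints len=3
    }
}
```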
Solution:
import java.util.ArrayList;
import java.util.List;

public class Codec {

    // Encodes a list of strings to a single string.
    public String encode(List<String> strs) {
        if (strs == null)
            throw new IllegalArgumentException("strs is null");
        StringBuilder buffer = new StringBuilder();
        for (String str : strs) {
            // Each block is "<length>:<string>".
            buffer.append(str.length()).append(':').append(str);
        }
        return buffer.toString();
    }

    // Decodes a single string to a list of strings.
    public List<String> decode(String s) {
        List<String> ret = new ArrayList<String>();
        int next_start = 0;
        // All per-block work happens inside the loop, so an empty input needs no special case.
        while (next_start < s.length()) {
            // The first ":" at or after next_start ends the length field.
            int split_index = s.indexOf(":", next_start);
            int len = Integer.parseInt(s.substring(next_start, split_index));
            // Take exactly len characters after the ":" without inspecting them.
            String item = s.substring(split_index + 1, split_index + 1 + len);
            ret.add(item);
            next_start = split_index + 1 + len;
        }
        return ret;
    }
}

// Your Codec object will be instantiated and called as such:
// Codec codec = new Codec();
// codec.decode(codec.encode(strs));
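A round-trip check on awkward inputs is a quick way to gain confidence in the loop. The sketch below repeats the same length-prefix logic as static methods so that it runs on its own. The class name RoundTripSketch, the helper roundTrips, and the sample inputs are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class: the name is an assumption, not part of the original solution.
final class RoundTripSketch {
    static String encode(List<String> strs) {
        StringBuilder sb = new StringBuilder();
        for (String str : strs) {
            sb.append(str.length()).append(':').append(str);
        }
        return sb.toString();
    }

    static List<String> decode(String s) {
        List<String> ret = new ArrayList<String>();
        int nextStart = 0;
        while (nextStart < s.length()) {
            int splitIndex = s.indexOf(":", nextStart);
            int len = Integer.parseInt(s.substring(nextStart, splitIndex));
            ret.add(s.substring(splitIndex + 1, splitIndex + 1 + len));
            nextStart = splitIndex + 1 + len;
        }
        return ret;
    }

    // True when decoding the encoded list gives back an equal list.
    static boolean roundTrips(List<String> strs) {
        return decode(encode(strs)).equals(strs);
    }

    public static void main(String[] args) {
        System.out.println(roundTrips(new ArrayList<String>()));            // empty list
        System.out.println(roundTrips(Arrays.asList("")));                  // one empty string
        System.out.println(roundTrips(Arrays.asList("", "")));              // two empty strings
        System.out.println(roundTrips(Arrays.asList("12:34", ":", "0:")));  // separators and digits
        System.out.println(roundTrips(Arrays.asList("line\nbreak", " ")));  // whitespace characters
    }
}
```

Each line of output should be true. Note that an empty list encodes to the empty string, while a list holding one empty string encodes to "0:", so the two cases remain distinguishable after decoding.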