You will use this file of 4 million random phone numbers, possibly with repeats, whose name is passed in args[0]. In addition, you will be given an output filename in args[1], into which you will write the sorted list of phone numbers without duplicates.
Each phone number begins with the digit 2 or greater, so phone numbers range from 2000000 through 9999999. There are no area codes or country code prefixes to deal with. The file looks something like:
9827019
5655875
2593305
8586163
6115967
9128969
9998369
8551496
...
Here are some declarations to get you started:
Scanner inFile = new Scanner(new FileReader(args[0]));
PrintWriter outFile = new PrintWriter(args[1]);
byte[] bitmap = new byte[1000000];
You will implement the bitmap as an array of 1,000,000 bytes (that is, of type byte). There are 8,000,000 possible phone numbers and each byte holds 8 bits, so each bit in the array represents exactly one phone number. A bit is turned on if and only if its number was read from the file. Bits that remain off indicate numbers that never appeared in the file.
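As a rough illustration of the bit-per-number idea, here is a minimal sketch of setting and testing a bit. The class name PhoneBitmap, the helpers mark and isMarked, and the constant FIRST_NUMBER are hypothetical names chosen for this example, not part of the assignment. It also assumes one particular layout (byte index = offset / 8, bit index = offset % 8, lowest bit first); other layouts work as long as setting and reading agree.

```java
class PhoneBitmap {
    // Assumption: bit 0 of byte 0 stands for 2000000, the smallest valid number.
    static final int FIRST_NUMBER = 2_000_000;

    // Turn on the bit that represents the given phone number.
    static void mark(byte[] bitmap, int phone) {
        int offset = phone - FIRST_NUMBER;          // 0 .. 7,999,999
        bitmap[offset >> 3] |= 1 << (offset & 7);   // byte = offset / 8, bit = offset % 8
    }

    // Report whether the bit for the given phone number is on.
    static boolean isMarked(byte[] bitmap, int phone) {
        int offset = phone - FIRST_NUMBER;
        return (bitmap[offset >> 3] & (1 << (offset & 7))) != 0;
    }

    public static void main(String[] args) {
        byte[] bitmap = new byte[1_000_000];
        mark(bitmap, 5655875);
        mark(bitmap, 5655875);                      // a repeat just sets the same bit again
        System.out.println(isMarked(bitmap, 5655875));
        System.out.println(isMarked(bitmap, 5655876));
    }
}
```

Marking the same number twice leaves the bitmap unchanged, which is why duplicates disappear without any extra bookkeeping.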
You will want to review Bitwise Operators.
Using this bitmap guarantees no repeats in the resulting file and, as a side effect, automatically sorts the data: you process the bitmap sequentially from beginning to end, inspecting every bit and writing the phone number represented by that bit to the output file if the bit is turned on.
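The two passes described above might be sketched as follows. This is only one possible arrangement: the class name BitmapSort, the method sortUnique, and the constant FIRST_NUMBER are hypothetical, and the demo reads from a string rather than from args[0] so it can run on its own. It assumes every input value lies in 2000000 through 9999999 and uses the layout byte index = offset / 8, bit index = offset % 8.

```java
import java.io.PrintWriter;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Scanner;

class BitmapSort {
    static final int FIRST_NUMBER = 2_000_000;   // assumed number for bit 0 of byte 0

    static void sortUnique(Scanner in, PrintWriter out) {
        byte[] bitmap = new byte[1_000_000];

        // Pass 1: read every value and turn on its bit (repeats hit the same bit).
        while (in.hasNextInt()) {
            int offset = in.nextInt() - FIRST_NUMBER;
            bitmap[offset >> 3] |= 1 << (offset & 7);
        }

        // Pass 2: walk the bitmap front to back, so numbers come out ascending.
        for (int i = 0; i < bitmap.length; i++) {
            for (int bit = 0; bit < 8; bit++) {
                if ((bitmap[i] & (1 << bit)) != 0) {
                    out.println(FIRST_NUMBER + i * 8 + bit);
                }
            }
        }
        out.flush();
    }

    public static void main(String[] args) {
        // Tiny stand-in for the real input file, including one repeat.
        Scanner in = new Scanner(new StringReader("9827019\n5655875\n2593305\n5655875\n"));
        StringWriter result = new StringWriter();
        sortUnique(in, new PrintWriter(result));
        System.out.print(result);
    }
}
```

Run on the four sample values, the demo prints 2593305, 5655875 and 9827019, one per line. Note that println writes the platform's line separator, so the same program produces different file sizes on Unix and on Windows/DOS.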
Process all values before writing to the output file.
Your output file should have all the found numbers with no duplicates in ascending, sorted order.
As a guide, d2.dat (provided above) is 32,000,000 bytes. Your result file should be 25,195,720 bytes on Unix or 28,345,185 bytes on Windows/DOS. Any other size indicates an error in processing and you may want to