
For this programming lab, you are to implement a complete, one-program Huffman encoding / decoding system. The name of your program MUST be HUFF.exe. Your program will have five different modes in which it can operate, and those modes will be determined by the command-line arguments (see below). There will be three kinds of files your program will need to be able to read / write (depending on the mode):

1) Input Files. For the moment, let’s assume these will be text files, so I’ll use the extension .txt.

Note, however, that your program should also be able to successfully encode any file of any type. It must work just as well on a text file or a binary file, like a video or an executable program.

2) Encoded files. These will be (Huffman-compressed) files encoded by your program. Let’s use the extension .huf for these.

3) Tree-Building information files. These will contain the information required to build a Huffman tree from some (particular) input file. These will always be 510 bytes. Their extension will be .htree.

The five modes your program is to support are given by the command-line syntax definitions below:

Mode 1 – Show Help:
Syntax:
    HUFF -h
    HUFF -?
    HUFF -help
Description: Displays the proper usage options to the user and exits.

Mode 2 – Encode Directly From Input File:
Syntax:
    HUFF -e file1 [file2]
Description: Encode file1, placing the output into file2. Your program is to read file1, build a Huffman tree from the file, and then produce file2, whose first 510 bytes will be the tree-building information, and the remainder of which will be the Huffman-encoded bit stream created from file1, padded as we discussed in class to fill the last byte. Note that file2 is optional (as indicated by its being in square brackets in the command-line syntax). If the user omits file2, you are to take file1, remove its extension, and append .huf. If file2 is omitted and file1 has no extension, then just append .huf to the name of file1. If file2 is specified, use its name as-supplied, even if the user did not specify an extension. File1 and file2 must not refer to the same file.
Examples:
    HUFF -e shakespeare.txt
        Encode shakespeare.txt into shakespeare.huf
    HUFF -e hamlet.txt hamlet.enc
        Encode hamlet.txt into hamlet.enc
    HUFF -e hamlet
        Encode hamlet into hamlet.huf
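The default-filename rule above (replace file1's extension, or append one if there is none) can be sketched as a small helper. This is only an illustration: the function name defaultOutputName and its parameters are hypothetical, and treating a name that begins with a dot (such as ".profile") as having no extension is an assumption the assignment does not address.

```cpp
#include <string>

// Hypothetical helper (not named in the assignment): build the default
// output filename by replacing inputName's extension with newExt, or by
// appending newExt when inputName has no extension.
std::string defaultOutputName(const std::string& inputName, const std::string& newExt)
{
    // Locate the start of the bare filename, so that a dot in a directory
    // name (e.g. "D:\my.dir\hamlet") is not mistaken for an extension.
    std::string::size_type lastSeparator = inputName.find_last_of("\\/:");
    std::string::size_type nameStart =
        (lastSeparator == std::string::npos) ? 0 : lastSeparator + 1;

    // ASSUMPTION: a dot only counts as an extension marker if it comes
    // after the first character of the bare filename.
    std::string::size_type lastDot = inputName.rfind('.');
    bool hasExtension = (lastDot != std::string::npos) && (lastDot > nameStart);

    if (hasExtension)
    {
        return inputName.substr(0, lastDot) + newExt;
    }
    return inputName + newExt;
}
```

Called as defaultOutputName(file1, ".huf"), this would cover the encoding modes; the same helper with ".htree" would cover the tree-building mode.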

Mode 3 – Decode:
Syntax:
    HUFF -d file1 file2
Description: Decode Huffman-encoded file1 into file2. File1 will contain a 510-byte tree-builder block, followed by the Huffman-encoded bit stream for some original input file (perhaps the output from -e above). Your program will read in the tree-builder block, construct the tree, and use it to decode file1, placing the result into file2. File2 should be bit-identical to the original input file. Note that file2 is not optional (it’s not in square brackets in the command-line syntax): the user must specify both file names, and again, both file names must not refer to the same file.
Examples:
    HUFF -d hamlet.enc hamlet2.txt
        Decode hamlet.enc into hamlet2.txt
    HUFF -d shakespeare.huf all-plays.txt
        Decode shakespeare.huf into all-plays.txt
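As a rough sketch of the first step of decoding, the 510-byte block can be read into a plain array before the bit stream is processed. The helper names below are hypothetical. How the 510 bytes are interpreted is not specified in this handout, so the block is left as raw bytes, and the most-significant-bit-first order used by bitAt is an assumption rather than the format discussed in class.

```cpp
#include <fstream>

const int TREE_BUILDER_SIZE = 510;   // block size stated in the assignment

// Hypothetical helper: read the tree-builder block from the start of an
// already-open encoded file. Returns false if the file holds fewer than
// TREE_BUILDER_SIZE bytes.
bool readTreeBuilderBlock(std::ifstream& in, unsigned char block[TREE_BUILDER_SIZE])
{
    in.read(reinterpret_cast<char*>(block), TREE_BUILDER_SIZE);
    return in.gcount() == TREE_BUILDER_SIZE;
}

// Hypothetical helper: extract one bit from a byte.
// ASSUMPTION: position 0 is the most significant bit; the real bit order
// must match whatever the encoder used.
int bitAt(unsigned char byte, int position)
{
    return (byte >> (7 - position)) & 1;
}
```

After a successful call to readTreeBuilderBlock, the stream is positioned at the first byte of the encoded bit stream, so the remaining bytes can be read one at a time and walked bit by bit.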

Mode 4 – Create A Tree-Building File:
Syntax:
    HUFF -t file1 [file2]
Description: Your program will read file1, and use the information in it to produce a 510-byte tree-builder file named file2. As with the encoding mode, file2 is optional (it’s in square brackets). If file2 is not specified and file1 has an extension, replace it with .htree. If file2 is not specified and file1 does not have an extension, append .htree to file1 to produce the filename for file2. If file2 is specified, then use file2 as-supplied. Both file names must not refer to the same file.
Examples:
    HUFF -t shakespeare
        Produces 510-byte shakespeare.htree
    HUFF -t shakespeare.txt shakespeare.htree
        Produces 510-byte shakespeare.htree
    HUFF -t hamlet hamlet.510
        Produces 510-byte hamlet.510

Mode 5 – Encoding with Specified Tree-Builder:
Syntax:
    HUFF -et file1 file2 [file3]
Description: Your program will read input from file1, and, rather than building a tree from the contents of file1, encode it using a tree built from the 510-byte tree-builder information in file2, producing file3 (note: file3 is optional, and will default to the same filename as file1, except with an extension of .huf, which will be either changed or appended to file1’s name as appropriate). The first 510 bytes of file3 will be the tree-builder information from file2, followed by the Huffman-encoded bit stream from file1.
Examples:
    HUFF -et Shakespeare.txt bible.htree
        Produces encoded Shakespeare.huf from Shakespeare.txt, using the tree-builder information in bible.htree.
    HUFF -et hamlet.txt Hunchback.htree hamlet.huf
        Produces encoded hamlet.huf from hamlet.txt, using the tree-builder information in Hunchback.htree.

The following end-to-end example will show how the system can be used. Suppose we have text files Hunchback.txt, Shakespeare.txt, and Hamlet.txt. If we read all of Shakespeare.txt and build a tree (saving the resulting tree-building information in Shakespeare.htree), we can then use that tree to encode Shakespeare.txt (which should be a “perfect” fit, and give maximal compression). If we use the same tree to encode Hamlet.txt, the fit should be very good (but not ideal), as both are Elizabethan prose. Likewise, if we build a tree from Hamlet.txt, it should encode all of Shakespeare’s works (Shakespeare.txt) very well, but not quite optimally. If we use the Shakespeare tree to encode Hunchback.txt, the compression should not be nearly as good as if we encode Hunchback directly.

The following fourteen commands will allow us to test this system rather thoroughly:

    HUFF -t Shakespeare.txt
    HUFF -t Hamlet.txt
    HUFF -e Shakespeare.txt Shakespeare1.huf
    HUFF -et Shakespeare.txt hamlet.htree Shakespeare2.huf
    HUFF -e Hamlet.txt Hamlet1.huf
    HUFF -et Hamlet.txt shakespeare.htree Hamlet2.huf
    HUFF -e hunchback.txt hunchback1.huf
    HUFF -et hunchback.txt shakespeare.htree hunchback2.huf
    HUFF -d Shakespeare1.huf shakespeare1.txt
    HUFF -d Shakespeare2.huf shakespeare2.txt
    HUFF -d hamlet1.huf hamlet1.txt
    HUFF -d hamlet2.huf hamlet2.txt
    HUFF -d hunchback1.huf hunchback1.txt
    HUFF -d hunchback2.huf hunchback2.txt

After executing these 14 commands, we should have:

• Two 510-byte tree-builders (Shakespeare.htree and Hamlet.htree)
• Encoded file Shakespeare1.huf, which should be a little smaller than Shakespeare2.huf
• Encoded file Hamlet1.huf, which should be a little smaller than Hamlet2.huf
• Encoded file Hunchback1.huf, which should be a little smaller than Hunchback2.huf
• Bit-identical copies of Shakespeare.txt, Shakespeare1.txt, and Shakespeare2.txt
• Bit-identical copies of Hamlet.txt, Hamlet1.txt, and Hamlet2.txt
• Bit-identical copies of Hunchback.txt, Hunchback1.txt, and Hunchback2.txt

In all of these examples, I have shown (for simplicity) the files as residing in the same directory as the executable, HUFF.exe. This is not a requirement. As long as the files actually exist, they can be located anywhere, if we use sufficiently qualified paths (with the appropriate drive / directory specifications):

    HUFF -d D:\hunchback.huf R:\decoded\hunchback.txt

(In this example, I am obviously assuming that there is a file called hunchback.huf in the root of the D: drive, and that there is a folder on the R: drive called decoded, and that the program is allowed to write to that folder.)

If an input file cannot be found or opened, or an output file cannot be created, display an error message telling the user what went wrong and exit without going any further.

A valid run of the program should output nothing other than its elapsed time in decimal seconds, with three decimal places (i.e., precise to the millisecond, displayed as a number like “1.234 seconds”), along with the number of bytes input (read) and output (written). There should be no debugging output statements in your code, and no cute headers like “Welcome to Huffland”. Your program is not to prompt the user for any information; what it needs will come purely from the command line.
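One possible way to produce the timing and byte-count line is sketched below, using only <ctime>, <cstdio>, and strings. The helper names are hypothetical, and the exact wording of the line simply mirrors the assignment's sample output. Note that clock() reports processor time on many systems; whether that is close enough to elapsed time for this purpose is an assumption worth checking on your own machine.

```cpp
#include <cstdio>
#include <ctime>
#include <string>

// Hypothetical helper: format a count with comma separators,
// e.g. 5053475 becomes "5,053,475".
std::string withCommas(unsigned long long value)
{
    // Build the plain digit string, least significant digit first.
    std::string digits;
    do
    {
        digits = static_cast<char>('0' + value % 10) + digits;
        value /= 10;
    } while (value > 0);

    // Insert a comma before each group of three digits, from the right.
    for (int pos = static_cast<int>(digits.length()) - 3; pos > 0; pos -= 3)
    {
        digits.insert(static_cast<std::string::size_type>(pos), ",");
    }
    return digits;
}

// Hypothetical helper: seconds elapsed since an earlier clock() reading.
double elapsedSeconds(std::clock_t start)
{
    return static_cast<double>(std::clock() - start) / CLOCKS_PER_SEC;
}

// Hypothetical helper: build the one-line report with the time shown to
// three decimal places.
std::string formatReport(double seconds, unsigned long long bytesIn, unsigned long long bytesOut)
{
    char timeText[32];
    std::snprintf(timeText, sizeof(timeText), "%.3f", seconds);
    return std::string("Time: ") + timeText + " seconds. "
        + withCommas(bytesIn) + " bytes in / "
        + withCommas(bytesOut) + " bytes out";
}
```

In use, main() would record std::clock() before calling the Huffman method and pass the result of elapsedSeconds, together with the two byte counts, to formatReport once the work is done.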

Sample output might be:

    Time: 3.456 seconds. 5,053,475 bytes in / 3,456,789 bytes out

All code you write for this assignment must be yours and yours alone (except for what is in the lecture slides and the Savitch text, of course). You may not use any code from any other source, including the Internet, even as a reference. Using code other than what you write (or are explicitly permitted to use) will be considered academic dishonesty, and will be dealt with in most severe terms.

Your program should utilize good programming practices – information hiding, etc. Your public methods should not give away HOW Huffman works – main() should never see your Huffman Tree’s root pointer, for example. You may have to create some public methods that, in turn, call private methods to get the job done. Your program should contain a class called Huffman, with the following public methods to support the five command-line modes:

    MakeTreeBuilder(string inputFile, string outputFile)
    EncodeFile(string inputFile, string outputFile)
    DecodeFile(string inputFile, string outputFile)
    EncodeFileWithTree(string inputFile, string treeFile, string outputFile)
    DisplayHelp()

Obviously, there will likely be some additional private methods, and some class-level variables (and perhaps arrays that will need to be used within the class?), but those can all be private, so main() should only need to interact with the Huffman class through its public methods. It might make sense for your main() function to call validateCommandLineParameters(int argc, char* argv[]) (NOT part of Huffman.cpp) to SEE if the user has specified valid command-line arguments (and flesh out the missing ones, for those that are optional), and then simply call the appropriate Huffman methods. That modular design keeps main() short, and allows you to debug the command-line processing incrementally.
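As a sketch of how the command-line validation might begin, the flag in argv[1] can be mapped to a mode, and the number of file names that follow can be checked against what that mode allows. The enum and function names here are hypothetical. Matching flags case-sensitively and letting the help mode ignore extra arguments are both assumptions that the assignment does not settle.

```cpp
#include <string>

// Hypothetical mode identifiers, one per command-line mode.
enum Mode
{
    MODE_INVALID,
    MODE_HELP,
    MODE_ENCODE,
    MODE_DECODE,
    MODE_TREE,
    MODE_ENCODE_WITH_TREE
};

// Map the flag in argv[1] to a mode.
// ASSUMPTION: flags are matched exactly, in lowercase.
Mode modeFromFlag(const std::string& flag)
{
    if (flag == "-h" || flag == "-?" || flag == "-help") return MODE_HELP;
    if (flag == "-e")  return MODE_ENCODE;
    if (flag == "-d")  return MODE_DECODE;
    if (flag == "-t")  return MODE_TREE;
    if (flag == "-et") return MODE_ENCODE_WITH_TREE;
    return MODE_INVALID;
}

// Check the number of file names supplied after the flag (argc - 2)
// against the syntax definition for each mode.
bool fileCountIsValid(Mode mode, int fileCount)
{
    switch (mode)
    {
        case MODE_HELP:             return true;   // ASSUMPTION: extras ignored
        case MODE_ENCODE:           return fileCount == 1 || fileCount == 2;
        case MODE_DECODE:           return fileCount == 2;
        case MODE_TREE:             return fileCount == 1 || fileCount == 2;
        case MODE_ENCODE_WITH_TREE: return fileCount == 2 || fileCount == 3;
        default:                    return false;
    }
}
```

A validation routine built on these two checks could then fill in any omitted optional file name and hand control to the matching public Huffman method, which keeps main() short.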

Your code must be well-documented:

• Every source file (.cpp/.h) must have a header block with the file’s name, the date, course, author, and a brief description of what’s IN the file.

• Use block comments at the start of each method, briefly explaining what it does, and how it does what it does. Method-header comments should start on the first line after the opening brace for a method (not above the method header). This allows the header comment to disappear if you collapse the method. Please follow the Allman style of brace placement.

• Use line comments liberally to explain what’s going on. You may want short block comments here and there to introduce / explain particular segments of the code.

• Use internal documentation, like descriptive variable names where appropriate.

• You may NOT use any STL components, including, but not limited to, vectors, bitsets, hashmaps, or other classes from the library that will do work for you. Strings, arrays, and file streams are all you should need for this one.

Some helpful starting points:

• Create your program as a Windows Console application within VS 2017 (see PPT slides on BB)

• Your program should use the std namespace; NOT the System namespace

• To get access to cin and cout, use “#include <iostream>”

• To get access to the string class and its supporting code, “#include <string>”

• You will want to use stream file I/O operations to read / write your files. That means your input file will be an input file stream (ifstream), and your output file will be an output file stream (ofstream). You have already #included <iostream> in order to get access to cin and cout, but you will need to #include <fstream> in order to get access to ifstream and ofstream. The Savitch text will be a helpful resource for this part! Also consult the Visual Studio online documentation.

• You won’t be doing any math on the characters – everything will be logical (bit) operations, so your chars should all be unsigned. This may require some casting considerations.
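The file-stream and unsigned-char points above can be illustrated with a small byte-counting routine. The function name countByteFrequencies is hypothetical, and counting byte frequencies is shown only as a typical first step, not as the required design. The file is opened in binary mode so that every byte is delivered unchanged, and each char is cast to unsigned char before it is used as an array index, since a plain char may be signed and would then produce negative indexes for byte values above 127.

```cpp
#include <fstream>
#include <string>

const int BYTE_VALUES = 256;   // number of distinct values a byte can hold

// Hypothetical helper: count how often each byte value occurs in a file.
// Returns false if the file cannot be opened.
bool countByteFrequencies(const std::string& fileName, unsigned long long counts[BYTE_VALUES])
{
    for (int i = 0; i < BYTE_VALUES; i++)
    {
        counts[i] = 0;
    }

    // Binary mode: no newline translation, so the file is read exactly
    // as it is stored on disk.
    std::ifstream in(fileName.c_str(), std::ios::binary);
    if (!in)
    {
        return false;
    }

    char raw;
    while (in.get(raw))
    {
        // Cast before indexing, so bytes 128..255 map to 128..255
        // rather than to negative values.
        unsigned char value = static_cast<unsigned char>(raw);
        counts[value]++;
    }
    return true;
}
```

The same binary-mode flag applies to the output side (ofstream), where bytes are best written with a cast in the opposite direction.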

DO NOT WAIT TO GET STARTED ON THIS – YOU WILL END UP WRITING SEVERAL HUNDRED LINES OF COMPLEX, HARD-TO-DEBUG CODE FOR THIS ASSIGNMENT; IT’S TOO BIG TO DO AT THE LAST MINUTE, AND THERE WON’T BE ANY EXTENSIONS GIVEN ON THIS ASSIGNMENT – WE WILL HAVE ANOTHER BIG ONE RIGHT BEHIND IT!

IF YOU RUN INTO PROBLEMS, AND HAVE MADE A GOOD ATTEMPT AT SOLVING THEM, E-MAIL ME, RATHER THAN WASTING TIME LOST IN THE WILDERNESS

To turn in your program, use 7-Zip (www.7-zip.org) to create a compressed archive of your entire Visual Studio workspace for this project (don’t just submit your source and/or .exe files). Submit your 7-Zip archive to Blackboard.

As for testing, you may want to create some test scripts, and store them in batch files. A batch file is a text file containing multiple commands, which can be executed en masse by simply invoking the name of the batch file. If the fourteen commands from the end-to-end example above are stored in a file called TEST.BAT, located in the same directory as HUFF.exe, then you can simply type TEST from the command line (assuming there’s not also a TEST.exe in the same directory!), and the commands in that file will run as if typed.

If you want to compare two files to see if they are bit-identical copies, use the Windows system command fc (file compare), and run it in binary mode:

Syntax:
    fc /b file1 file2
Example:
    fc /b shakespeare.txt shakespeare1.txt
    fc /b shakespeare.txt shakespeare2.txt

If all three files are identical, then you should see “FC: no differences encountered” appear after each run. The commands to run fc can also be embedded in your batch file.

If you also want to include in your batch file the commands to display the file sizes (after all, we’re looking for Shakespeare1.huf to be at least a little shorter than Shakespeare2.huf), you could use:

    dir /on shakespeare?.huf
    dir /on hamlet?.huf
    dir /on hunchback?.huf

The ? character is a single-character “wildcard” match, which means that the first line above will display the directory listings of shakespeare1.huf and shakespeare2.huf. Since the first one should be the smaller of the two, we want it to display first, and the “/on” option tells the system to sort the directory listings by name (/on = “order by name”).

There may be other Windows system commands that might help you with this one. Type “help” from a command window to see a list of commands available. Even if none of these are directly applicable, knowing a bit about batch files and how to automate the operation of a PC from program to program can be a valuable skill!

This project will be graded primarily on correctness. If your code doesn’t work correctly, then it won’t get many points. If it won’t even compile, then it gets a zero, regardless of how many lines of code you turn in. There will be some subjective points for code style and documentation, but the majority of the points will come from whether or not I can get your program to produce incorrect results for any of my test scripts. Consequently, it will behoove you to create a number of test scenarios that test various possibilities! There will also be a subjective portion of the grade related to documentation, structure, and efficiency. If your program takes a very inefficient approach, and takes too long to run, that will cost a few points, even if the output is correct.
