| 07-29-2008, 01:06 AM | #1 |
I've been hacking at the wave and huffman decompression code in libmpq for about a day now. The huffman stuff is in a particularly horrible state [ most evil line ever: ht->first = (tree_item*)&ht->first ]. I've managed to get a readable implementation working, but there are errors in the outputs. The wave files I decompress sound correct, except for some sort of chopping sound in the background. I was wondering if anyone knew other projects with source code for these mpq decompression algos. |
| 07-29-2008, 08:54 AM | #2 |
The recently posted Sfmpq library should have it all working I guess. |
| 07-29-2008, 09:38 AM | #3 |
I actually looked and couldn't find it anywhere in the source code. It looks like it delegates that to storm.dll. |
| 07-29-2008, 08:45 PM | #4 |
take a look at StormLib |
| 07-30-2008, 04:57 AM | #5 |
Excellent! The code in that library clearly came from the same source as what I was looking at, but had more work done on it. Now decompression is working like a charm! -- Now it's time to start refactoring until the code looks much nicer. |
| 07-30-2008, 08:14 PM | #6 |
SFmpq uses the same zezula sources with the evil huffman (most likely it was copied off asm code ><) I guess you could try using some known huffman algorithms and see if they work. I got tired of rewriting code so I just copied the zezula sources for pkzip huffman and wave into my library. |
| 07-30-2008, 11:55 PM | #7 |
Did you spend much time trying to understand the huffman decompression? It's insane!. Setting a variable to its own address, cast to the variable's original type, should be illegal. I looked all over the code trying to find out why ht->last was changing, and it was because if ht->first = &ht->first then ht->first->prev points at ht->last . Changing the order of variables in the structs breaks the code because of that.In any case I managed to make my own implementation of it. Once you start really taking out the cruft, it drops down to ~300 lines. |
| 07-31-2008, 02:49 AM | #8 |
I'm interested in seeing your final result. A working readable implementation of MPQ decompression code would be fantastic. I mean, hell, there could even be comments in there too. |
| 07-31-2008, 08:31 AM | #9 |
I'll post it once I get it cleaned up. There are still little problems here and there. I can extract almost all the wave files from the war3 mpq, but some of them still don't work (eg. one of the hero lich responses). I still don't do bzip, zlib. |
| 07-31-2008, 09:20 AM | #10 |
Again, the zezula code is most likely just a quick analysis of storm.dll disassembly, and "quick analysis" implies that a lot of things will be guessed wrong and will look ugly. I said I gave up trying to make it look nice. zlib and bzip are easy because there are libraries for them, but I haven't figured out where to get a dll'less bzip2 library (the one I downloaded ended up requiring a dll). |
| 08-04-2008, 06:51 AM | #11 | |
Quote:
Funny thing is we had a similar question in our computing exam for uni, 3/4 of the people got it wrong. People that right that sought of code should be shot, its like trying to follow a cheetah on speed and rocket boosters |
| 08-08-2008, 02:14 AM | #12 |
Here's the class I use to decompress huffman. It doesn't stand on its own, but it's a hell of a lot clearer than what I based it on. Language is VB2005.net. Code:
'''<summary>Exposes a plaintext IO.Stream around a huffman compressed substream</summary>
Public Class MPQHuffmanStream
Inherits ConversionStream
'''<summary>The frequency tables used to construct the initial huffman trees</summary>
Public Shared ReadOnly frequencyTables()() As UInteger = { _
array(Of UInteger)( _
10, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, _
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, _
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, _
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, _
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, _
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, _
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, _
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2 _
), _
array(Of UInteger)( _
84, 22, 22, 13, 12, 8, 6, 5, 6, 5, 6, 3, 4, 4, 3, 5, 14, 11, 20, 19, 19, 9, 11, 6, 5, 4, 3, 2, 3, 2, 2, 2, _
13, 7, 9, 6, 6, 4, 3, 2, 4, 3, 3, 3, 3, 3, 2, 2, 9, 6, 4, 4, 4, 4, 3, 2, 3, 2, 2, 2, 2, 3, 2, 4, _
8, 3, 4, 7, 9, 5, 3, 3, 3, 3, 2, 2, 2, 3, 2, 2, 3, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 2, 1, 2, 2, _
6, 10, 8, 8, 6, 7, 4, 3, 4, 4, 2, 2, 4, 2, 3, 3, 4, 3, 7, 7, 9, 6, 4, 3, 3, 2, 1, 2, 2, 2, 2, 2, _
10, 2, 2, 3, 2, 2, 1, 1, 2, 2, 2, 6, 3, 5, 2, 3, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 3, 1, 1, 1, _
2, 1, 1, 1, 1, 1, 1, 2, 4, 4, 4, 7, 9, 8, 12, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 3, _
4, 1, 2, 4, 5, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 4, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, _
2, 1, 1, 1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 2, 2, 1, 1, 2, 2, 2, 6, 75 _
), _
array(Of UInteger)( _
0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 39, 0, 0, 35, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, _
255, 1, 1, 1, 1, 1, 1, 1, 2, 2, 1, 1, 6, 14, 16, 4, 6, 8, 5, 4, 4, 3, 3, 2, 2, 3, 3, 1, 1, 2, 1, 1, _
1, 4, 2, 4, 2, 2, 2, 1, 1, 4, 1, 1, 2, 3, 3, 2, 3, 1, 3, 6, 4, 1, 1, 1, 1, 1, 1, 2, 1, 2, 1, 1, _
1, 41, 7, 22, 18, 64, 10, 10, 17, 37, 1, 3, 23, 16, 38, 42, 16, 1, 35, 35, 47, 16, 6, 7, 2, 9, 1, 1, 1, 1, 1 _
), _
array(Of UInteger)( _
255, 11, 7, 5, 11, 2, 2, 2, 6, 2, 2, 1, 4, 2, 1, 3, 9, 1, 1, 1, 3, 4, 1, 1, 2, 1, 1, 1, 2, 1, 1, 1, _
5, 1, 1, 1, 13, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 3, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, _
10, 4, 2, 1, 6, 3, 2, 1, 1, 1, 1, 1, 3, 1, 1, 1, 5, 2, 3, 4, 3, 3, 3, 2, 1, 1, 1, 2, 1, 2, 3, 3, _
1, 3, 1, 1, 2, 5, 1, 1, 4, 3, 5, 1, 3, 1, 3, 3, 2, 1, 4, 3, 10, 6, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, _
2, 2, 1, 10, 2, 5, 1, 1, 2, 7, 2, 23, 1, 5, 1, 1, 14, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, _
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, _
6, 2, 1, 4, 5, 1, 1, 2, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, _
1, 1, 1, 1, 1, 1, 1, 1, 7, 1, 1, 2, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 17 _
), _
array(Of UInteger)( _
255, 251, 152, 154, 132, 133, 99, 100, 62, 62, 34, 34, 19, 19, 24, 23 _
), _
array(Of UInteger)( _
255, 241, 157, 158, 154, 155, 154, 151, 147, 147, 140, 142, 134, 136, 128, 130, _
124, 124, 114, 115, 105, 107, 95, 96, 85, 86, 74, 75, 64, 65, 55, 55, _
47, 47, 39, 39, 33, 33, 27, 28, 23, 23, 19, 19, 16, 16, 13, 13, _
11, 11, 9, 9, 8, 8, 7, 7, 6, 5, 5, 4, 4, 4, 25, 24 _
), _
array(Of UInteger)( _
195, 203, 245, 65, 255, 123, 247, 33, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, _
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, _
191, 204, 242, 64, 253, 124, 247, 34, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, _
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, _
122, 70 _
), _
array(Of UInteger)( _
195, 217, 239, 61, 249, 124, 233, 30, 253, 171, 241, 44, 252, 91, 254, 23, _
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, _
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, _
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, _
189, 217, 236, 61, 245, 125, 232, 29, 251, 174, 240, 44, 251, 92, 255, 24, _
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, _
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, _
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, _
112, 108 _
), _
array(Of UInteger)( _
186, 197, 218, 51, 227, 109, 216, 24, 229, 148, 218, 35, 223, 74, 209, 16, _
238, 175, 228, 44, 234, 90, 222, 21, 244, 135, 233, 33, 246, 67, 252, 18, _
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, _
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, _
176, 199, 216, 51, 227, 107, 214, 24, 231, 149, 216, 35, 219, 73, 208, 17, _
233, 178, 226, 43, 232, 92, 221, 21, 241, 135, 231, 32, 247, 68, 255, 19, _
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, _
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, _
95, 158 _
) _
}
Private Class HuffmanNode
Public leftChild As HuffmanNode = Nothing
Public rightChild As HuffmanNode = Nothing
Public parent As HuffmanNode = Nothing
Public val As Integer = -1
Public freq As UInteger = 0
Public Sub New(ByVal val As Integer, ByVal freq As UInteger)
Me.val = val
Me.freq = freq
End Sub
Public Sub New(ByVal leftChild As HuffmanNode, ByVal rightChild As HuffmanNode)
Me.leftChild = leftChild
Me.rightChild = rightChild
leftChild.parent = Me
rightChild.parent = Me
Me.freq = leftChild.freq + rightChild.freq
End Sub
End Class
'''<summary>A binary tree with the property that the sum over the leafs of depth*frequency is minimized</summary>
Private Class HuffmanTree
Public ReadOnly nodes As New List(Of HuffmanNode) 'sorted list of the nodes in the tree
Public leafMap As New Dictionary(Of Integer, HuffmanNode) 'takes a value and gives the leaf containing that value
'''<summary>Constructs the initial tree using the given frequency table</summary>
Public Sub New(ByVal freqTable As UInteger())
'leafs
For i As Integer = 0 To freqTable.Length - 1
If freqTable(i) > 0 Then
insert(New HuffmanNode(i, freqTable(i)))
End If
Next i
'special leafs
insert(New HuffmanNode(&H100, 1)) '[end of stream]
insert(New HuffmanNode(&H101, 1)) '[new value follows]
'tree
For i As Integer = nodes.Count - 1 To 1 Step -1
insert(New HuffmanNode(nodes(i), nodes(i - 1)))
Next i 'decrementing by 1 is enough because the new node shifted the list
End Sub
'''<summary>Adds the node to the sorted list of nodes, and adds leafs to the value map</summary>
Private Sub insert(ByVal n As HuffmanNode)
If n.val <> -1 Then leafMap(n.val) = n
For i As Integer = 0 To nodes.Count - 1
If n.freq > nodes(i).freq Then
nodes.Insert(i, n)
Return
End If
Next i
nodes.Add(n)
End Sub
'''<summary>Increases the frequency of the given value, and updates the tree to maintain optimality</summary>
Public Sub increase(ByVal val As Integer)
Dim n As HuffmanNode
'Create a new node for the value if it isn't even in the tree
If Not leafMap.ContainsKey(val) Then
'add the new value node by pairing it with the lowest frequency node
'[this transformation maintains the optimality of the tree]
n = New HuffmanNode(val, 0)
leafMap(val) = n
Dim m As HuffmanNode = nodes(nodes.Count - 1) 'sibling
Dim g As HuffmanNode = m.parent 'grandparent
Dim p As New HuffmanNode(n, m) 'parent
p.parent = g
If g.rightChild Is m Then g.rightChild = p Else g.leftChild = p
nodes(nodes.Count - 1) = p
nodes.Add(m)
nodes.Add(n)
End If
'Get the leaf with the value to increase
n = leafMap(val)
'Increase the frequency of the leaf and its ancestors and restructure the tree to match
While n IsNot Nothing
n.freq += CUInt(1)
'find the new position the node must occupy in the ordered list
Dim i As Integer = nodes.IndexOf(n) - 1
While i >= 0 AndAlso nodes(i).freq < n.freq
i -= 1
End While
i += 1
'if the node has to change positions, we need to update the tree and the list
Dim m As HuffmanNode = nodes(i)
If m IsNot n Then
'switch places in list
nodes(nodes.IndexOf(n)) = m
nodes(i) = n
'switch subtrees rooted at n and m
'[note that m has the same frequency as n used to have, so m's new ancestors will not need to be updated]
Dim t As HuffmanNode
If m.parent Is n.parent Then
t = m.parent.rightChild
m.parent.rightChild = m.parent.leftChild
m.parent.leftChild = t
Else
If m.parent IsNot Nothing Then If m.parent.rightChild Is m Then m.parent.rightChild = n Else m.parent.leftChild = n
If n.parent IsNot Nothing Then If n.parent.rightChild Is n Then n.parent.rightChild = m Else n.parent.leftChild = m
End If
t = m.parent
m.parent = n.parent
n.parent = t
End If
'repeat for ancestors
n = n.parent
End While
End Sub
End Class
Private tree As HuffmanTree = Nothing
Private isZeroTree As Boolean = False
Private numBytesDecompressed As Integer = 0
Private hitEnd As Boolean = False
Private curNode As HuffmanNode = Nothing
Private ReadOnly subBitBuffer As New BitBuffer() 'buffered coded bits
Private ReadOnly overBitBuffer As New BitBuffer() 'buffered decoded bits
Public Sub New(ByVal substream As IO.Stream, Optional ByVal mode As modes = modes.read)
MyBase.New(substream, mode)
End Sub
Public Overrides Property Position() As Long
Get
Return numBytesDecompressed
End Get
Set(ByVal value As Long)
Seek(value, IO.SeekOrigin.Begin)
End Set
End Property
Public Overrides Function estimateInputSize(ByVal outputSize As Integer) As Integer
estimateInputSize = CInt(outputSize * 0.8) + 1
If tree Is Nothing Then estimateInputSize += 1
End Function
Public Overrides Sub convert(ByVal readBuffer() As Byte, ByVal readOffset As Integer, ByRef readCount As Integer, _
ByVal writeBuffer() As Byte, ByVal writeOffset As Integer, ByRef writeCount As Integer)
Dim maxWriteCount As Integer = writeCount
Dim maxReadCount As Integer = readCount
writeCount = 0
readCount = 0
If hitEnd Then Return
'first byte is the index of the frequency table used to build the tree
If tree Is Nothing Then
If readCount >= maxReadCount Then Return
Dim frequencyTableIndex As Byte = readBuffer(readOffset + readCount)
readCount += 1
isZeroTree = frequencyTableIndex = 0
If frequencyTableIndex >= frequencyTables.Length Then Throw New IO.IOException("Error decompressing Huffman Stream: invalid frequency index")
tree = New HuffmanTree(frequencyTables(frequencyTableIndex))
End If
'decompress
While writeCount < maxWriteCount
Do While overBitBuffer.numBits >= 8
writeBuffer(writeOffset + writeCount) = overBitBuffer.takeByte()
writeCount += 1
If writeCount >= maxWriteCount Then Exit While
Loop
'travel to leaf node
If curNode Is Nothing Then curNode = tree.nodes(0) 'root
Do While curNode.val = -1
If subBitBuffer.numBits <= 0 Then
If readCount >= maxReadCount Then Exit While
subBitBuffer.queueByte(readBuffer(readOffset + readCount))
readCount += 1
End If
curNode = switch(subBitBuffer.takeBit(), curNode.rightChild, curNode.leftChild)
Loop
'interpret leaf value
Select Case curNode.val
Case &H100 'end of stream
hitEnd = True
Exit While
Case &H101 'new value
If readCount >= maxReadCount Then Exit While
subBitBuffer.queueByte(readBuffer(readOffset + readCount))
readCount += 1
Dim b As Byte = subBitBuffer.takeByte()
overBitBuffer.queueByte(b)
tree.increase(b)
If Not isZeroTree Then tree.increase(b)
Case Else
overBitBuffer.queueByte(CByte(curNode.val))
If isZeroTree Then tree.increase(curNode.val)
End Select
curNode = Nothing
End While
numBytesDecompressed += writeCount
End Sub
End Class |
| 08-08-2008, 02:29 AM | #13 |
Bah, VB... you should code in C. Looks good though. |
