Multimedia dedicated weblog.

How to make a DirectShow Muxer Filter - Part 1

August 23rd, 2008 Posted in DirectShow

If you have enjoyed my previous DirectShow articles, I hope you will find this tutorial useful as well. This one is going to be a bit more advanced, but I will try to make everything clear, so if you know what threads are you should not have any problems understanding the topic.


Muxer and splitter filters are considerably more complex than, for instance, transform filters, and unfortunately MSDN does not say much about them either, so let’s start with some basic facts.

In general, a muxer is a filter that accepts multiple streams at its input, transforms the incoming data in some way and produces a single stream of raw data that is usually meant either for storage or for transmission. Depending on the format, muxing scenarios vary in complexity. Simple cases, such as an MPEG program stream with one video stream, could easily be implemented as a transform filter that just inserts packet headers at the proper places. More complex cases, such as a fully interleaved multiple-track MP4 file, might require proper stream synchronisation and additional adjustments to the final file after the muxing procedure is over. When working with DirectShow one must also keep in mind that the incoming data is usually delivered on multiple threads, so great attention must be paid to keeping everything thread-safe and to avoiding deadlocks.

In this tutorial we will implement the more complex scenario: a multiple-input multiplexer with its own interleaver and output thread. We will create a reusable base class for muxer-type filters and then implement a very simple Flash Video multiplexer that takes advantage of that base class.

The main idea

Fig. 1. Muxer schema

As figure 1 shows, our muxer will contain several input pins, one output pin, one interleaver unit with a buffer queue for each input pin, and one muxing thread that reads interleaved samples from the interleaver and delivers them downstream. If the muxer were meant only for live streams, it could also use the reference clock to block input streams and so synchronize incoming data. Since this is not our case, we will use the interleaver to keep incoming streams synchronized while still allowing full-speed processing. The point of the per-stream buffer queues is to let upstream filters run a few samples ahead, so that when a more complex frame is encountered and encoding takes longer, the graph does not stutter. Once a queue is full it blocks, and the upstream filter cannot deliver more data until the muxing thread has read some samples. This way all incoming streams deliver data at nearly the same speed. The muxing thread waits until at least one sample is available for every incoming stream that has not finished yet, then finds the sample with the lowest timestamp and reads it out of its queue. If that queue was full, reading marks it as ready-to-write again and the upstream filter can resume delivering data. Once all incoming streams have finished, we deliver an End-Of-Stream notification downstream, which will eventually complete the graph.
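The selection rule described above can be sketched in portable C++. This is only an illustration under simplifying assumptions: the `Packet`, `Stream` and `ReadLowest` names are hypothetical stand-ins (not the classes from the downloadable code), and there is no locking or blocking here, just the "wait for every unfinished stream, then take the lowest timestamp" decision.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical, simplified stand-ins for the interleaver structures.
struct Packet {
    int     stream_no;
    int64_t timestamp;          // e.g. 100 ns units, like REFERENCE_TIME
};

struct Stream {
    std::deque<Packet> queue;   // samples waiting to be muxed
    bool finished = false;      // set once the stream has reached EOS
};

enum ReadResult { READ_OK = 0, READ_NOT_READY = -1, READ_EOS = -3 };

// Pick the queued packet with the lowest timestamp, but only once every
// stream that is still running has at least one packet queued.
ReadResult ReadLowest(std::vector<Stream> &streams, Packet &out)
{
    int best = -1;
    for (std::size_t i = 0; i < streams.size(); i++) {
        Stream &s = streams[i];
        if (s.queue.empty()) {
            if (!s.finished) return READ_NOT_READY;  // must wait for this stream
            continue;                                // drained and finished
        }
        if (best < 0 ||
            s.queue.front().timestamp < streams[best].queue.front().timestamp) {
            best = (int)i;
        }
    }
    if (best < 0) return READ_EOS;    // every stream is finished and empty

    out = streams[best].queue.front();
    streams[best].queue.pop_front();  // frees a slot for the upstream filter
    return READ_OK;
}
```

Returning "not ready" as soon as one running stream is empty is what keeps a fast stream from getting ahead of a slow one: nothing is written until the slowest stream has shown what its next timestamp is.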

I should note that if the muxer had to handle very high data rates (hundreds of Mbps), the extra memory copy of incoming samples into the queue buffers might slow things down a little. This could be solved with a custom allocator that provides the IMediaSamples for the upstream filter to fill; those buffers would then form the queue directly, so no extra copying would be necessary. For now the simple solution will do.

The muxer base class

To make our nice little idea work in real life we’re going to implement a base class for muxer-type filters. The point is to separate the common concerns into a versatile base class and then derive a special-purpose muxer class that only does the "real work".

Our base class should take care of the following:

  • Dynamically create input pins as they are needed
  • Query the output peer pin for the IStream interface so that we can perform random-access IO operations on the output file
  • Interleave incoming samples
  • Block upstream filters to make them deliver data at the same speed
  • Provide a muxing thread that reads interleaved samples and calls derived class to do something with them
  • Handle state changes and call virtual methods - OnStartStreaming and OnStopStreaming

Fig. 2. CBaseMux classes

Output pin and dynamic input pins

When handling pins we will do the same as other filter base classes do. Both input and output pins will forward CheckMediaType, SetMediaType, GetMediaType, BreakConnect and CompleteConnect calls to the parent filter class. This will allow us to decide what types to allow and what types to reject all in one place.

We have only one requirement for the output pin’s peer: it must expose the IStream interface so that we can use random access when writing the output file. If you do not need random access you can also use the classic delivery mechanism and deliver IMediaSamples downstream. Keep in mind that when connected to the File Writer filter, the timestamp values are interpreted as absolute byte positions in the file. By default our base class negotiates samples with a maximum size of 512 KB. I believe you won’t need to touch this.
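To illustrate why random access matters, here is a minimal sketch of the "write a placeholder, patch it later" pattern that many container formats need. It is an assumption-laden stand-in: a `std::ostream` plays the role of the output file, and `WriteU32` and `WriteSizedChunk` are hypothetical helpers. In the actual filter the same idea would be expressed with `IStream::Seek` and `IStream::Write` on the interface obtained from the peer pin.

```cpp
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Write a 32-bit value in big-endian byte order.
static void WriteU32(std::ostream &out, uint32_t v)
{
    char b[4] = { (char)(v >> 24), (char)(v >> 16), (char)(v >> 8), (char)v };
    out.write(b, 4);
}

// Writes [4-byte payload size][payload]. The size is not known up front,
// so a placeholder is written first and patched once the payload is done.
void WriteSizedChunk(std::ostream &out, const std::vector<std::string> &parts)
{
    std::streampos header_pos = out.tellp();
    WriteU32(out, 0);                       // placeholder for the size

    uint32_t total = 0;
    for (const std::string &p : parts) {
        out.write(p.data(), (std::streamsize)p.size());
        total += (uint32_t)p.size();
    }

    std::streampos end_pos = out.tellp();
    out.seekp(header_pos);                  // random access: jump back
    WriteU32(out, total);                   // patch in the real size
    out.seekp(end_pos);                     // continue appending at the end
}
```

With plain downstream sample delivery there is no way to go back like this, which is why the base class asks the peer for a seekable stream.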

int CBaseMux::AddPin()
{
    CBaseMuxInputPin    *pin = NULL;
    WCHAR                    name[1024];

    // prepare a name for the pin
    _swprintf(name, L"In %d", (int)pins.size());

    int ret = CreatePin(&pin, name);
    if (ret < 0) return -1;

    // and append it into the list
    pin->index = (int)pins.size();
    pins.push_back(pin);

    return 0;
}

HRESULT CBaseMux::CompleteInputConnect(CBaseMuxInputPin *pin, IPin *pReceivePin)
{
    // if all pins are connected, we should add a new one
    CAutoLock        lck(&lock_filter);

    if (AllConnected()) AddPin();
    return NOERROR;
}

Fig. 3. Dynamic creation of input pins

Our base class takes care of dynamic input pins and creates new pins as they are needed. This is done in the AddPin method. To make the muxer use your own pin class, override the CreatePin method.

int CBaseMux::CreatePin(CBaseMuxInputPin **pin, LPCWSTR name)
{
    // the derived class might need to override this method
    // to provide custom pins

    if (!pin) return -1;

    HRESULT                hr = NOERROR;
    CBaseMuxInputPin    *new_pin = new CBaseMuxInputPin(L"MuxPin", this, &hr, name);

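    // fail if the allocation failed or the pin constructor reported an error
    if (!new_pin || FAILED(hr)) {
        delete new_pin;
        *pin = NULL;
        return -1;
    }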

    *pin = new_pin;
    return 0;
}

Fig. 4. CreatePin method
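As a rough illustration of how a derived muxer could plug in its own pin class, the sketch below mirrors the AddPin/CreatePin split with plain C++ types. Everything here is hypothetical: `InputPin`, `MuxBase`, `FlvInputPin` and `FlvMux` are simplified stand-ins rather than the DirectShow-based classes from the article, and smart pointers are used only to keep the example short.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for the input pin class.
struct InputPin {
    explicit InputPin(std::wstring n) : name(std::move(n)) {}
    virtual ~InputPin() {}
    virtual const char *Kind() const { return "generic"; }
    std::wstring name;
    int index = -1;
};

class MuxBase {
public:
    virtual ~MuxBase() {}

    // Same job as AddPin in the article: name the pin, let CreatePin
    // build it, then append it to the list.
    int AddPin() {
        std::wstring name = L"In " + std::to_wstring((int)pins.size());
        std::unique_ptr<InputPin> pin = CreatePin(name);
        if (!pin) return -1;
        pin->index = (int)pins.size();
        pins.push_back(std::move(pin));
        return 0;
    }

    std::vector<std::unique_ptr<InputPin>> pins;

protected:
    // Derived classes override this to supply their own pin type.
    virtual std::unique_ptr<InputPin> CreatePin(const std::wstring &name) {
        return std::unique_ptr<InputPin>(new InputPin(name));
    }
};

// A hypothetical format-specific pin and muxer.
struct FlvInputPin : InputPin {
    explicit FlvInputPin(std::wstring n) : InputPin(std::move(n)) {}
    const char *Kind() const override { return "flv"; }
};

class FlvMux : public MuxBase {
protected:
    std::unique_ptr<InputPin> CreatePin(const std::wstring &name) override {
        return std::unique_ptr<InputPin>(new FlvInputPin(name));
    }
};
```

The base class never needs to know the concrete pin type; it only decides when a new pin is needed and what it is called.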

Interleaver

The next important component of the muxer base class is the interleaver subsystem. The interleaver must be configured when the filter changes to active state. At this point the base class scans through all input pins and creates interleaver streams for those that are connected. When the filter goes inactive the interleaver will be flushed and all streams will be destroyed.

void CBaseMux::StartStreaming()
{
    // scan through the pins
    for (unsigned int i=0; i<pins.size(); i++) {
        CBaseMuxInputPin    *pin = pins[i];

        // by default there is no stream associated with the pin
        pin->stream = NULL;

        if (pin->IsConnected()) {

            CMuxInterStream  *stream = NULL;
            CMediaType          &mt = pin->CurrentMediaType();
            int                        ret;

            // append the new stream
            ret = interleaver.AddStream(&stream);
            if (ret == 0) {
                pin->stream = stream;
                pin->stream->active = true;
                pin->stream->data = (void*)pin;   // associate it with the pin

                // ask the derived class if this stream should be interleaved
                pin->stream->is_interleaved = IsInterleaved(&mt);
            }
        }
    }

    // reset the output byte counter
    bytes_written = 0;

    // and then let the derived class do something about it
    OnStartStreaming();
}

Fig. 5. Interleaver configuration

The most important methods of the interleaver are Write, Read and GetPacket. When a new sample is received by one of the input pins, an instance of CMuxInterPacket must be obtained by calling CMuxInterleaver::GetPacket. The default implementation simply creates a new packet instance. If you need some sort of packet pool, you can override this method. The Write method simply inserts the given packet into the queue of the stream the packet belongs to.
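If allocation cost ever became a concern, a packet pool along the following lines is one option. This is only a sketch of the idea, not the article's code: `Packet` and `PacketPool` are hypothetical names, and a mutex is included because input pins deliver on different threads.

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical packet type; the real CMuxInterPacket carries more fields.
struct Packet {
    int stream_no = -1;
    std::vector<unsigned char> data;
};

class PacketPool {
public:
    ~PacketPool() {
        for (Packet *p : free_list_) delete p;
    }

    // Hand out a recycled packet if one is available, otherwise allocate.
    Packet *Get(int stream_no) {
        std::lock_guard<std::mutex> lock(mutex_);
        Packet *p;
        if (free_list_.empty()) {
            p = new Packet();
            allocated++;
        } else {
            p = free_list_.back();
            free_list_.pop_back();
        }
        p->stream_no = stream_no;
        p->data.clear();            // drops the contents, keeps the capacity
        return p;
    }

    // Return a packet to the pool instead of deleting it.
    void Release(Packet *p) {
        if (!p) return;
        std::lock_guard<std::mutex> lock(mutex_);
        free_list_.push_back(p);
    }

    std::size_t allocated = 0;      // how many packets were really allocated

private:
    std::mutex            mutex_;
    std::vector<Packet*>  free_list_;
};
```

Note that with a pool the muxing thread would have to call something like `Release` where the listings in this article call `delete`.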

int CBaseMux::ReceivePacket(CBaseMuxInputPin *pin, IMediaSample *sample)
{
    // the pin must have a valid stream
    if (!pin->stream) return -1;

    // now try to make an interleaver-packet out of it
    CMuxInterPacket    *packet;
    int                         ret;

    HANDLE            ev[] = { pin->stream->ev_can_write, interleaver.ev_abort };
    DWORD            dw;

    // wait until the stream is writable, or we’re being aborted
    while (true) {

        // if the pin is flushing - we’re done
        if (pin->IsFlushing()) return -1;

        dw = WaitForMultipleObjects(2, ev, FALSE, 20);
        if (abort) return -1;
        if (dw == WAIT_TIMEOUT) continue;
        if (dw == WAIT_OBJECT_0) break;

        // it must have been an abort event
        return -1;
    }

    ret = interleaver.GetPacket(&packet, pin->stream->index);
    if (ret < 0) return -1;

    // load the packet and interleave it
    packet->LoadFrom(sample);

    // not enough memory ?
    if (packet->data == NULL) {
        delete packet;
        return -1;
    }

    // bye bye
    ret = interleaver.Write(packet);
    if (ret < 0) {
        delete packet;
        return -1;
    }

    return 0;
}

Fig. 6. Packet reception
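The blocking behaviour in ReceivePacket (wait until the stream is writable, wake up periodically, give up on abort) can also be sketched without Win32 events. The version below swaps in `std::condition_variable` for the event handles and `WaitForMultipleObjects`; `BoundedQueue` and its members are hypothetical names, and packets are reduced to plain integers.

```cpp
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>

class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {}

    // Called by an input pin. Blocks while the queue is full.
    // Returns 0 on success, -1 if the queue was aborted.
    int Write(int packet) {
        std::unique_lock<std::mutex> lock(mutex_);
        while (!abort_ && items_.size() >= capacity_) {
            // wake up every 20 ms to re-check, as the original loop does
            can_write_.wait_for(lock, std::chrono::milliseconds(20));
        }
        if (abort_) return -1;
        items_.push_back(packet);
        return 0;
    }

    // Called by the muxing thread. Non-blocking in this sketch.
    bool Read(int &packet) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.empty()) return false;
        packet = items_.front();
        items_.pop_front();
        can_write_.notify_one();    // a slot is free, a writer may continue
        return true;
    }

    // Called when the filter stops: releases every blocked writer.
    void Abort() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            abort_ = true;
        }
        can_write_.notify_all();
    }

private:
    std::size_t             capacity_;
    std::deque<int>         items_;
    bool                    abort_ = false;
    std::mutex              mutex_;
    std::condition_variable can_write_;
};
```

The abort path matters as much as the normal one: without it a stopped graph could leave an upstream thread blocked forever inside the muxer.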

Muxing thread

At this point the base class is able to receive samples and queue them in the interleaver structures. Now we need a mechanism to read samples out of the interleaver and do something meaningful with them. The muxing thread is very simple: all it does is ask the interleaver for packets. Once a packet has been read from the interleaver, the thread calls the virtual OnMuxPacket method so that the derived class gets a chance to do something with the packet.

DWORD CBaseMux::ThreadProc()
{
    while (true) {
        int cmd = GetRequest();
        switch (cmd) {
        case CBaseMux::MUX_CMD_EXIT:         Reply(0); return -1;
        case CBaseMux::MUX_CMD_STOP:        Reply(0); break;
        case CBaseMux::MUX_CMD_RUN:
            {
                Reply(0);

                /**************************************************************
                **
                **    Muxing thread
                **
                ***************************************************************/

                bool        done = false;
                while (!done && !CheckRequest(NULL)) {

                    CMuxInterPacket     *packet = NULL;

                    int                          ret;

                    /*
                        0  - success
                        -1 - timeout
                        -2 - abort
                        -3 - EOS 
                    */
                    ret = interleaver.Read(&packet, 40);
                    switch (ret) {
                    case -2:        done = true; break;        // we’re being aborted
                    case -3:
                        { 
                            if (output && output->IsConnected()) {
                                output->DeliverEndOfStream();
                            } else {
                                // yell as much as we can - we’re done
                                NotifyEvent(EC_COMPLETE, 0, 0);
                            }
                            done = true;
                        }
                        break;
                    case 0:
                        {
                            // let’s do something with the packet
                            CMuxInterStream  *stream = interleaver.streams[packet->stream_no];
                            CBaseMuxInputPin *inpin = (CBaseMuxInputPin*)stream->data;

                            OnMuxPacket(packet, inpin); 
                        }
                        break;
                    }

                    // get rid of the packet
                    if (packet) delete packet;
                }

            }
            break;
        default:
            { 
                Reply(-1);
            }
            break;
        }
    }

    return 0;
}

Fig. 7. Muxing thread
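Stripped of the command handling, the inner loop reduces to a small dispatch on the interleaver's return code. The sketch below shows just that part, with assumptions called out: `MuxLoop`, `CollectingMux` and `OnEndOfStream` are hypothetical names, the interleaver is replaced by any callable that returns the same codes as in the listing above, and the `CheckRequest` test of the real thread is left out.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Return codes as used by the interleaver's Read in the listing above.
enum { READ_OK = 0, READ_TIMEOUT = -1, READ_ABORT = -2, READ_EOS = -3 };

struct Packet {
    int       stream_no;
    long long timestamp;
};

class MuxLoop {
public:
    virtual ~MuxLoop() {}

    // Runs until EOS or abort. Returns true if the loop ended with EOS.
    bool Run(const std::function<int(Packet&)> &read) {
        for (;;) {
            Packet packet{};
            switch (read(packet)) {
            case READ_OK:      OnMuxPacket(packet); break;
            case READ_TIMEOUT: break;                 // nothing yet, try again
            case READ_EOS:     OnEndOfStream(); return true;
            default:           return false;          // aborted
            }
        }
    }

protected:
    // The derived class does the format-specific work here.
    virtual void OnMuxPacket(const Packet &packet) = 0;
    virtual void OnEndOfStream() {}
};

// A trivial derived class that just records what it was given.
struct CollectingMux : MuxLoop {
    std::vector<Packet> written;
    bool eos = false;

    void OnMuxPacket(const Packet &packet) override { written.push_back(packet); }
    void OnEndOfStream() override { eos = true; }
};
```

A timeout is deliberately not an error: it just gives the loop a chance to notice a pending command before asking the interleaver again.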

Conclusion - Part 1

Congratulations on reading this far. By now you should have a good idea of how a base class for a muxer might work. For an even better understanding, download and read through the whole code so that you are prepared for the second part of this tutorial: the real-life implementation.

Download baseclass : muxer-class.zip (8 KB)

Enjoy,

Igor


  1. 12 Responses to “How to make a DirectShow Muxer Filter - Part 1”

  2. By Mercury_22 on Sep 9, 2008

    How ’bout How to make a Splitter ?

  3. By Igor Janos on Sep 9, 2008

    Sure. Trying to catch up with the things. First I’d like to finish the muxer tutorial…

  4. By Denis Gorodetskiy on Sep 12, 2008

    Thank you very much for the example!!!

  5. By malik cisse on Nov 20, 2008

    Hi Igor,

    First of all thanks for your example.
    It’s hard to find something else.
    I have problems compiling it. Here is the error:
    basemux.cpp(508) : error C2664: ‘CBaseInputPin::CBaseInputPin’ : cannot convert parameter 1 from ‘LPCTSTR’ to ‘TCHAR *’

    Any idea?

    Any idea where to find a splitter filter example?

    Thanks, Malik

  6. By Igor Janos on Nov 20, 2008

    LPCTSTR is a “const TCHAR *” type. It should not matter much. In any case it’s strange that you’re getting errors.

    As for the splitter example I have an article in my mind but so far haven’t have time to write it. Try checking the blog in a few weeks.

  7. By malik cisse on Nov 26, 2008

    Thanks for the info.

    Malik

  8. By eznasi on Jun 2, 2009

    what about sync few (upto 10) graphs one to others?
    suppose we start a stream (Mpeg2). can we synch them together via the PTS of each?
    moreover - I do not see any timing clock (as it is appears in the audio renderer in Graphedit)

  9. By Igor Janos on Jun 7, 2009

    I do not quite understand. Can you be more specific ?

  10. By agedboy on Jul 4, 2009

    What about the splitter (demuxer)? That’s as tough as the muxer. And neither does it has any offical document.

    Expect you write anything about it.;-)

    By the way, you’ve provided an excellent MP4 muxer, that’s so great. Do you have made a MP4 demuxer yet?;-)

  11. By Igor Janos on Jul 4, 2009

    Hey. Well… sure I could write an article about the splitters but I’m totaly lacking any spare time right now :-\.

    - yes I have made a MP4 demuxer some time ago. It would need a little bit of polishing if it was to be released to public. :-\

  12. By Jamie Fenton on Jul 5, 2009

    Geriant Davis has good examples of both MP-4 muxers and demuxers over at his GDCL web page. (www.gdcl.co.uk) He also has a very liberal license for those wishing to modify them (and most everything else he has released) - essentially “leave my name on it but don’t sue me”.

    There is nothing more precious than seeing a DirectShow problem solved in two different ways, particularly by masters of the art with a deep understanding about what should or should not be changed with regards to standard practice. Comparing Igor and Geriant’s stuff reveals much wisdom of this sort.

    – Jamie

  1. 1 Trackback(s)

  2. Oct 4, 2008: RadScorpion’s blog » Blog Archive » How to make a DirectShow Muxer Filter - Part 2
