These are (more or less) the steps to take:

 - with VirtualDub:

1) Pick and prepare (if required, e.g. crop to a useful length) your chosen video.

2) Load it in VirtualDub.

3) Load the vdscript for the conversion.

4) Run "save as avi..." from the File menu. I choose a consistent filename stub, <title>_<region>, e.g. "tron_ntsc", so the file is "tron_ntsc.avi".

5) Now load the avi produced (File/Open video file...)

6) From the File menu, choose Export/Raw Audio.... I append "_au" to the stub, with no extension, e.g. "tron_ntsc_au".

7) From the File menu, choose Export/Raw Video.... I append "_av" to the stub, with no extension, e.g. "tron_ntsc_av". Use these settings:

Format: 4:2:0 YCbCr planar, Rec. 601, 8-bit, Y:16-235 centred

Secondary plane order: Cb/Cr (V/U)

Scanline alignment: 4 bytes

Vertical orientation: Top-down

8) From a command prompt, encode the audio: "encaudio60 ntsc_au < tron_ntsc_au"
9) Then encode the video: "encvideo60n ntsc_av < tron_ntsc_av"
10) Combine the two files: "mux60n ntsc_av ntsc_au tron_ntsc.bin"
 
You now have a hard drive image that can be mounted as a Side 2 hard disk in Altirra.
Using File/Boot Image..., load "movplay60n.obx" and the video should play.
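
Steps 8 to 10 can be collected into one small shell function. This is only a sketch: it assumes the three tools are on the PATH of a Unix-style shell (the steps above use a Windows command prompt), and the names convert_ntsc and DRY_RUN are made up for illustration.

```sh
# Sketch only: convert_ntsc and DRY_RUN are hypothetical names, not part of the tools.
# Expects <stub>_au and <stub>_av (the raw exports from steps 6 and 7) in the
# current directory and writes <stub>.bin.
convert_ntsc() {
    stub=$1
    if [ "${DRY_RUN:-0}" = 1 ]; then
        # Print the three commands instead of running them.
        echo "encaudio60 ntsc_au < ${stub}_au"
        echo "encvideo60n ntsc_av < ${stub}_av"
        echo "mux60n ntsc_av ntsc_au ${stub}.bin"
        return 0
    fi
    # Stop at the first step that fails.
    encaudio60 ntsc_au < "${stub}_au" &&
    encvideo60n ntsc_av < "${stub}_av" &&
    mux60n ntsc_av ntsc_au "${stub}.bin"
}
```

Running "DRY_RUN=1 convert_ntsc tron_ntsc" only prints the commands, which is a cheap way to check the file names before a long encode.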

-----

- with MEncoder:

#PAL conversion

mencoder -nosound -of rawvideo -ovc raw -vf hue=0:1,scale=160:192,format=yv12,harddup,swapuv -sws 6 -ofps 49.86 atari_v5_2.mp4 -o atari_v5-pal.raw
#mplayer -quiet  -ao pcm:waveheader:file=atari_v5-pal.wav -vc dummy -vo null -channels 1
sox atari_v5.wav -C 0.5 -c 1 -b 8 -r 15558 atari_v5-pal.u8 gain -l 10
./encvideo50n <atari_v5-pal.raw atari_v5-pal.mov
./encaudio60n <atari_v5-pal.u8 atari_v5-pal.aud
./mux50n atari_v5-pal.mov atari_v5-pal.aud atari_v5-pal.bin
dd bs=512 seek=17 if=atari_v5-pal.bin of=/media/sda2/Emulators/Altirra/atari_v5-pal.dd
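
A quick sanity check on the .raw file from the mencoder line above: with scale=160:192 and a planar 4:2:0 8-bit format, each frame should be 160 x 192 x 3/2 = 46080 bytes, so the file size should be an exact multiple of that. The sketch below assumes even frame dimensions; frame_bytes and frame_count are hypothetical helper names.

```sh
# Bytes in one 4:2:0 planar 8-bit frame: a full-size Y plane plus two
# quarter-size chroma planes, i.e. width * height * 3 / 2.
frame_bytes() {
    echo $(( $1 * $2 * 3 / 2 ))
}

# Print the number of whole frames in a raw file, or fail if the size is
# not an exact multiple of the frame size.
# Arguments: raw file, width, height.
frame_count() {
    file=$1
    fb=$(frame_bytes "$2" "$3")
    size=$(( $(wc -c < "$file") ))
    if [ $(( size % fb )) -ne 0 ]; then
        echo "$file: $size bytes is not a whole number of ${fb}-byte frames" >&2
        return 1
    fi
    echo $(( size / fb ))
}
```

For example, "frame_count atari_v5-pal.raw 160 192" prints the frame count, or complains if the raw file was written at some other size or format.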
 

PAL conversion is usually commented out as I have only NTSC Ataris.

 

In the mencoder command the second hue value (the saturation) is usually 2 for NTSC videos 
and only 1 for PAL.  NTSC Ataris need help with color saturation.

 

The second audio extraction (the mplayer line) is commented out because the .WAV is the same 
for either conversion.

 

For the first run I usually comment out everything below the sox command.
I want to look at the output and see if there is any clipping.
If the content is non-musical then I typically allow up to 3-5 clipping occurrences.
Otherwise, no clipping allowed.

 

If there is no clipping I push the gain parameter higher and run sox again. 
Repeat until I find a maximum.
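
The trial-and-error gain search can be scripted, roughly like this. It is a sketch under assumptions: it relies on sox reporting clipping on stderr as "... clipped N samples ..." (the exact wording may differ between sox versions), it assumes clipping only gets worse as gain goes up, and count_clips and max_gain are hypothetical helper names.

```sh
# Sum every "clipped N samples" figure in sox's warnings (read from stdin).
# Prints 0 when there are no such warnings.
count_clips() {
    sed -n 's/.*clipped \([0-9][0-9]*\) samples.*/\1/p' |
        awk '{ total += $1 } END { print total + 0 }'
}

# Try the given gains in order (lowest first) and print the last one whose
# clip count stays within the limit; prints an empty line if none qualifies.
# Arguments: input.wav, allowed clips, then the gains to try.
max_gain() {
    src=$1
    limit=$2
    shift 2
    best=
    probe=${TMPDIR:-/tmp}/gain_probe_$$.u8   # throwaway output file
    for g in "$@"; do
        # Same sox options as the conversion above; only the gain changes.
        clips=$(sox "$src" -C 0.5 -c 1 -b 8 -r 15558 "$probe" gain -l "$g" 2>&1 | count_clips)
        [ "$clips" -le "$limit" ] || break
        best=$g
    done
    rm -f "$probe"
    echo "$best"
}
```

For example, "max_gain atari_v5.wav 0 6 8 10 12 14" prints the highest of those gains with no clipping at all; use a limit of 3 to 5 for non-musical content.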

 

Some audio is just too dynamic.  By the time the gain is low enough to stop 
the clipping, the converted video is too quiet or full of noise.  
I'm not sure where the noise comes from.  At that point I have to edit the 
audio file.  I use Audacity.

 

The use of dd in this example just offsets the video by Phaeron's required 
16 sectors.   With an image file the seek value is 17.   Going to a real CF 
card I think it's 16. 

 

dd is kind of slow for making an image file.  It would probably be better to 
do something like cat 16_sector_padding converted_video_file > CF_image_file, 
where the padding file is as long as the dd offset (seek x 512 bytes).  The 
image would be truncated, but that's OK.  It's just the end of the video, not the 
end of the CF card.
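
A minimal sketch of that cat approach, assuming 512-byte sectors as in the dd line above. make_image is a hypothetical helper name, and the padding is generated on the fly from /dev/zero instead of being kept in a separate file.

```sh
# Write <sectors> 512-byte sectors of zero padding followed by the video.
# Arguments: video file, image file, offset in sectors.
make_image() {
    video=$1
    image=$2
    sectors=$3
    {
        dd if=/dev/zero bs=512 count="$sectors" 2>/dev/null
        cat "$video"
    } > "$image"
}
```

"make_image atari_v5-pal.bin atari_v5-pal.dd 17" should give the same layout as the dd seek=17 line: 17 x 512 = 8704 bytes of zeros, then the video.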

-----

