Going faster than a-rate?

Hi again,

The farther I dig into Csound, the faster my confusion about it grows - at a-rate, or wait, am I still in the init-pass? :smiley: By now I’m almost certain that I will make a fool of myself by asking questions, but hey, I’m still convinced that learning Csound is worth it! :slight_smile:

Can I use an opcode that works on an a-rate input, like “pitch” for example, and make it process faster than a-rate? As far as I understand it, most of Csound’s opcodes are designed to work in “real time”, that is, at a-rate.

What if I needed to scan the pitch of a 3-minute audio sample at intervals of 50 ms “at init time” (and store the pitch information in a table / array for later use), i.e. before any sound gets generated? It looks to me as if this requires building an instrument that “runs the audio sample data at a-rate” in order for the pitch opcode to be able to operate, meaning that the pitch scanning alone would take 3 minutes. I would need this scanning phase to be as short as possible, faster than a-rate - theoretically, I assume, it could be done in a couple of seconds. I also had a look at the pvs opcodes, and the principle seems to be the same.

Just to be clear: I understand that the init-pass is generally the place where one pre-calculates stuff, sort of “independently” of the k- and a-cycles. My question is really about using opcodes there (I believe).

Can someone give me a hint? Am I missing something, or am I trying to use Csound for something it’s not meant for? In that case I assume it’s time for Python (handling the “pre-processing” there and then calling Csound via the API for playback).

Any thoughts, warnings, suggestions, hints will be highly appreciated! :slight_smile:

Kind regards,
stefan

You can run over your sound file at i-time using a standard while loop. In pseudo code:

iCnt init 0
iLen = ftlen(iTableWithSoundFile)
while iCnt < iLen do
    iVal tab_i iCnt, iTableWithSoundFile  ; read one sample at i-time
    ; ...analyse/process iVal accordingly...
    iCnt += 1
od

To make sure that the other instruments that need this information don’t start until this pre-processing is done, I would schedule them to run from this instrument rather than from the score.
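
A minimal sketch of that idea, assuming the sound file lives in a GEN01 table (the instrument names, file name, and table are placeholders, not from an actual project):

giSoundTab ftgen 0, 0, 0, 1, "sound.wav", 0, 0, 1  ; deferred-size table holding the sound file

instr PreProcess
    ; ...run the i-time analysis loop shown above here...
    ; this line runs at init time, after the loop above has completed:
    schedule "Player", 0, ftlen(giSoundTab)/sr
endin

instr Player
    aSig diskin2 "sound.wav", 1  ; playback that can now use the analysis data
    out aSig
endin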

Thanks @rorywalsh for this blazing fast response and the example!

I do understand i-time to that extent, as (unclearly) expressed in the question :wink:
It’s more about opcodes that specifically take a-rate variables as input. I really need to work on the clarity of my questions in general… :sweat_smile:

So, if I understand you correctly, you’re suggesting that within such an i-time while loop (or however many are necessary) I can “fake” a-rate variables and feed them to opcodes that explicitly take aVars?

The pitch opcode, for example, has the following signature:

koct, kamp pitch asig, iupdte, ilo, ihi, idbthresh

So I’ll somehow fill up my asig with values at i-time?
Somehow, after everything I’ve read about Csound so far, I assumed that’s not possible (which led me to the question). But if you say so, I will give this another shot!

A big pile of thanks again!

Ah, I see what you mean now. No, in this case it is not possible. Plus, many Csound opcodes have an internal state, which means calling them in a loop gives strange results. I had to do something like this in the past, where an analysis had to be done first, but I had to do it in two calls to Csound - meaning I couldn’t do it in Cabbage as a single plugin. Oh wait, I guess I could have if I had used the Python opcodes, but I never bothered with them. And I’m not sure how much longer they will be supported or maintained.

Ok, that’s enlightening and confirms my suspicions. Thank you!!

So, the options are either:

  1. going the “Csound-inside-Python” route or the “Python-inside-Csound” route; the latter seems a bit problem-prone due to the Python 2.7 / maintenance issue. However, I read in the FLOSS manual that there’s “a plugin to port the Python opcodes to Python 3” and that there seems to be a way to “embed ctcsound in the Python opcodes.” I still have to check this out.

In both cases it won’t run once exported as a plugin from Cabbage, is that correct?

In that case, i.e. without the benefit of using your Csound program as a VST plugin, is there a real advantage to using Csound as the audio engine inside a Python program? I assume there are good, maybe more “modern” (in terms of syntax, audio handling etc.) audio libraries for Python that are just as efficient. Would it be fair to say that these days it’s mostly a matter of taste?

  2. writing opcodes for audio feature extraction that work at i-time, or looking into this:

http://git.1bpm.net/csound-xtract/about/

It seems to be in alpha, but who knows, maybe it’s working already!

You can run Python opcodes within Cabbage, but some hosts don’t like it, and may blacklist the plugin.

I’d say yes. I’ve used Csound with Python and it’s incredibly powerful. The two make a formidable resource, but perhaps not that accessible, certainly not from within a DAW.

Let’s ping the author @richardk and see what he says :wink:


If I am following correctly, it is actually possible to do this kind of faster-than-realtime / preparatory audio processing/analysis in Csound alone.

The pvsbufread manual page has a very interesting example that uses some trickery to perform audio operations faster than actual audio output, in a single k-pass. I now use it a lot for non-realtime analysis which can then inform realtime processes in the same performance.

I have an example here and attached pitchtrack.csd (it uses the file test.wav, but you can change that accordingly).

Basically, this example uses the pitch opcode to store analysis data in an ftable at specified period intervals (0.1 s) and then calls a couple of other instruments which demonstrate usage of the stored data, including a rough resynthesis and printing the frequency values. On my computer it performs the analysis of the 18 s test file in 0.65 s before running the other realtime instruments.
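
The faster-than-real-time trickery itself is in the attached file, but the storage side of the idea looks roughly like this (the table size and pitch parameters are illustrative guesses, not the exact values from pitchtrack.csd):

giPitchTab ftgen 0, 0, 1024, -2, 0  ; assumed size; one value per 0.1 s interval

instr Analyse
    asig diskin2 "test.wav", 1
    koct, kamp pitch asig, 0.1, 6, 10, 10  ; update every 0.1 s, range oct 6-10
    ktime timeinsts
    kndx = int(ktime / 0.1)                ; table index = elapsed time / period
    tablew cpsoct(koct), kndx, giPitchTab  ; store the detected frequency in Hz
endin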

Recently I wrote another example using a similar technique as a response to something on the Csound mailing list. That one is here and attached analysis_example.csd. It does some more types of analysis in a time-segmented manner based on amplitude thresholds, but doesn’t store the results in an f-table.
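
The gist of the amplitude thresholding, roughly (the threshold value and pitch parameters are made up for illustration, not taken from analysis_example.csd):

instr SegmentedAnalysis
    asig diskin2 "test.wav", 1
    koct, kamp pitch asig, 0.05, 6, 10, 10
    krms rms asig
    if krms > 0.05 then  ; assumed linear amplitude threshold
        ; only report analysis values while the signal is loud enough
        printks "cps: %f  amp: %f\n", 0.1, cpsoct(koct), kamp
    endif
endin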

The same technique is used for some MFCC corpus-based concatenative resynthesis stuff I’ve been working on, here. It’s currently still being refined, but the relevant part is that it uses the mfb opcode and stores analysis data vectors in an ftable, so it could be worth a look for more involved ftable storage, i.e. multiple analysis parameters.
I’m currently extending this with audio descriptors inspired by Øyvind Brandtsegg’s excellent feature extraction work, which is complicated but worth a look if you are interested in feature extraction.

The csound-xtract plugin is certainly still in some kind of alpha/beta, but everything there does seem to fundamentally function OK! The main development work required is really just extending the features obtained from libxtract.


PYO is a very nice audio package for Python, maybe the most fully featured. However, in my experience the efficiency isn’t good enough (I tried porting some quite heavy live performance stuff to it and had to abandon the attempt).

Thanks for this info Richard. Invaluable!


Thanks for the correction. That’s really good to know.

:slight_smile: I would not have expected this to work…

@richardk I wish I could express my appreciation for this post. This contains everything that I have been trying to find out for a couple of weeks now, and more. :smiley:

I studied the example from the pvsbufread manual entry and the one you attached, which uses the pitch opcode. This is precisely the kind of “trickery” I was thinking of, but could never (never say never) have implemented in Csound myself.

I will have a closer look at the more advanced examples. I’m especially looking forward to seeing how the “similarity matching” is done (given it is indeed the kind of “target-driven” concatenative synthesis method I think it is - it still takes me ages to read a more complex Csound program).

I’m also really excited to experiment with this. Thank you, again.

Kind regards,
stefan


Great - happy to be of assistance @stefan_mayer !

To outline the similarity matching: basically, the 16 bands of MFCC data are written to an ftable every fftsize/sr seconds, and the table is created large enough to store that for the whole analysis length. Then, when matching, every fftsize/sr seconds the input MFCC is compared with all of the values in the ftable using a Euclidean distance opcode (i.e. it goes through the table in a loop, advancing the index by 16 on each iteration). The frame with the smallest distance is then taken as the ‘best’ match.
There are different (better?) ways to do this, with some kind of ordering of the features, so that the entire table wouldn’t necessarily need to be scanned to find the smallest distance, but I think that is a bit hard to do with MFCCs. I’m working on enhancing this and using some other feature algorithms, but the general idea is there. Also, this is on a per-FFT-window basis, so for matching longer times I believe dynamic time warping is usually used to ensure more accurate results. In another project I’m working on, just using the mean of the MFCCs over the sound duration seems to provide reasonable results.
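
In rough UDO form, the brute-force scan could look like this (the opcode name and the flat 16-values-per-frame table layout are illustrative; the real code uses a dedicated Euclidean distance opcode rather than this inner loop):

opcode mfcc_nearest, kk, k[]ii
    kIn[], iTable, iNumFrames xin
    kBest = 0
    kMinDist = 1000000000
    kFrame = 0
    while kFrame < iNumFrames do
        ; squared Euclidean distance between the stored frame and the input
        kDist = 0
        kBand = 0
        while kBand < 16 do
            kDiff = tab:k(kFrame * 16 + kBand, iTable) - kIn[kBand]
            kDist += kDiff * kDiff
            kBand += 1
        od
        if kDist < kMinDist then
            kMinDist = kDist
            kBest = kFrame
        endif
        kFrame += 1
    od
    xout kBest, kMinDist
endop

Each k-period you would pass the current 16-element MFCC frame in and get back the index of the closest stored frame.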

I have a prebuilt Windows binary of csound-xtract at the bottom of this page, if it’s of any use.

With both, I’m happy to hear any feedback or suggestions for improvement!