Memory usage with example delay UDO from Csound book

This is the basic circular delay example from the Csound 6 book. It is easy to understand, but seems incredibly inefficient based on RAM usage. For my purposes (I’ve extended this quite a bit …), this approach was very attractive, but the RAM usage is crippling.

<Cabbage>
form caption("Untitled") size(400, 300), guiMode("queue") pluginId("def1")
rslider bounds(296, 162, 100, 100), channel("gain"), range(0, 1, 0, 1, .01), text("Gain"), trackerColour("lime"), outlineColour(0, 0, 0, 50), textColour("black")

</Cabbage>
<CsoundSynthesizer>
<CsOptions>
-n -d
</CsOptions>
<CsInstruments>
; Initialize the global variables. 
ksmps = 32
nchnls = 2
0dbfs = 1


opcode Delay,a,ai
setksmps 1
asig, idel xin
	kpos init 0
	isize = idel > 1/sr ? round(idel*sr) : 1
	adelay[] init isize
	xout adelay[kpos]
	adelay[kpos] = asig
	kpos = kpos == isize-1 ? 0 : kpos + 1
endop


instr 1
kGain cabbageGetValue "gain"

a1 inch 1

a_delay_out Delay a1, 60

outs (a1 + a_delay_out) * kGain, (a1 + a_delay_out) * kGain
endin

</CsInstruments>
<CsScore>
;causes Csound to run for about 7000 years...
f0 z
;starts instrument 1 and runs it for a week
i1 0 [60*60*24*7] 
</CsScore>
</CsoundSynthesizer>

I realize that I’ve set the delay to 60 seconds … but why the heck is Cabbage using 780 MB of RAM? I think this should be about 8.23 MB of data (one minute of 24-bit, 48 kHz audio). What am I missing?
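For reference, the expected footprint of a single 60-second mono buffer can be worked out directly. This is a rough Python sketch, not Csound. It assumes sr = 48000 and compares 3-byte (24-bit) samples with 8-byte samples. The 8-byte case assumes Csound stores samples internally as 64-bit floats, which I believe is true of double-precision builds but have not confirmed for this setup. Either way, the result is nowhere near 780 MB.

```python
# Back-of-envelope size of one 60-second mono delay buffer.
SR = 48_000      # sample rate in Hz (assumed)
SECONDS = 60     # delay length used in the post


def buffer_bytes(seconds, sr, bytes_per_sample):
    """Bytes needed to hold `seconds` of mono audio, one value per sample."""
    return seconds * sr * bytes_per_sample


as_24bit = buffer_bytes(SECONDS, SR, 3)   # 24-bit samples, as in the estimate above
as_double = buffer_bytes(SECONDS, SR, 8)  # assumption: 64-bit float samples

print(f"24-bit: {as_24bit / 2**20:.2f} MiB")   # 24-bit: 8.24 MiB
print(f"64-bit: {as_double / 2**20:.2f} MiB")  # 64-bit: 21.97 MiB
```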

Thank you!

I’ve built a few other versions for benchmarking.

This one is adapted from the FLOSS manual, with the delay code placed directly in the instrument. Cabbage uses about 50 MB of RAM.

<Cabbage>
form caption("Untitled") size(400, 300), guiMode("queue") pluginId("def1")
rslider bounds(296, 162, 100, 100), channel("gain"), range(0, 1, 0, 1, .01), text("Gain"), trackerColour("lime"), outlineColour(0, 0, 0, 50), textColour("black")

</Cabbage>
<CsoundSynthesizer>
<CsOptions>
-n -d
</CsOptions>
<CsInstruments>
; Initialize the global variables. 
ksmps = 32
nchnls = 2
0dbfs = 1


instr 1
kGain cabbageGetValue "gain"

a1 inch 1

;; 60 second delay
idel_size = 60 * sr
kdelay_line[] init idel_size
kread_ptr init 1
kwrite_ptr init 0

  kindx = 0
  while (kindx < ksmps) do
    kdelay_line[kwrite_ptr] = a1[kindx]
    adel[kindx] = kdelay_line[kread_ptr]

    kwrite_ptr = (kwrite_ptr + 1) % idel_size
    kread_ptr = (kread_ptr + 1) % idel_size

    kindx += 1
  od

outs (a1 + adel)*kGain, (a1 + adel)*kGain
endin

</CsInstruments>
<CsScore>
;causes Csound to run for about 7000 years...
f0 z
;starts instrument 1 and runs it for a week
i1 0 [60*60*24*7] 
</CsScore>
</CsoundSynthesizer>

Here is the FLOSS code, but I’ve moved it to a UDO. Cabbage uses about 50 MB of RAM with this approach, even with six calls to the UDO. Twenty instances used about 230 MB of RAM.

<Cabbage>
form caption("Untitled") size(400, 300), guiMode("queue") pluginId("def1")
rslider bounds(296, 162, 100, 100), channel("gain"), range(0, 1, 0, 1, .01), text("Gain"), trackerColour("lime"), outlineColour(0, 0, 0, 50), textColour("black")

</Cabbage>
<CsoundSynthesizer>
<CsOptions>
-n -d
</CsOptions>
<CsInstruments>
; Initialize the global variables. 
ksmps = 32
nchnls = 2
0dbfs = 1


opcode DelayFloss, a, ai

setksmps 32
	asig, idel xin
	
	idel_size = idel * sr
	kdelay_line[] init idel_size
	kread_ptr init 1
	kwrite_ptr init 0

  	kindx = 0
  	while (kindx < ksmps) do
    	kdelay_line[kwrite_ptr] = asig[kindx]
    	adel[kindx] = kdelay_line[kread_ptr]

    	kwrite_ptr = (kwrite_ptr + 1) % idel_size
    	kread_ptr = (kread_ptr + 1) % idel_size

    	kindx += 1
  	od

	xout adel

endop


instr 1
kGain cabbageGetValue "gain"

a1 inch 1

a1_delay DelayFloss a1, 60


outs (a1 + a1_delay)*kGain, (a1 + a1_delay)*kGain
endin

</CsInstruments>
<CsScore>
;causes Csound to run for about 7000 years...
f0 z
;starts instrument 1 and runs it for a week
i1 0 [60*60*24*7] 
</CsScore>
</CsoundSynthesizer>

My best guess is that Csound makes (ksmps/setksmps = 32) copies of the opcode and the needed memory in the first case.

What I’m ultimately trying to achieve is a sample-accurate delay line that:

  • gives me flexibility for moving the read and write pointers independently
  • allows me to process the delay line’s feedback
  • allows me to have multiple copies of it running simultaneously
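To make the three requirements above concrete, here is a minimal, language-neutral model in Python. It is not Csound and not the code from this thread. `DelayLine`, `OnePoleLowpass`, and `run` are hypothetical names invented for the sketch, and the one-pole filter merely stands in for whatever feedback processing is wanted.

```python
class DelayLine:
    """Circular buffer with independent read and write pointers."""

    def __init__(self, size):
        self.size = size
        self.buf = [0.0] * size
        self.write_ptr = 0
        self.read_ptr = 1 % size  # one slot ahead of the writer = oldest sample

    def read(self):
        return self.buf[self.read_ptr]

    def write(self, sample):
        self.buf[self.write_ptr] = sample

    def advance(self, read_step=1, write_step=1):
        # The two pointers can move by different amounts if needed.
        self.read_ptr = (self.read_ptr + read_step) % self.size
        self.write_ptr = (self.write_ptr + write_step) % self.size


class OnePoleLowpass:
    """Very simple smoothing filter used on the feedback path."""

    def __init__(self, coeff):
        self.coeff = coeff  # 1.0 passes the signal through unchanged
        self.state = 0.0

    def process(self, x):
        self.state += self.coeff * (x - self.state)
        return self.state


def run(signal, size, feedback=0.5, coeff=1.0):
    """Process `signal` one sample at a time; each call owns its own buffer."""
    line = DelayLine(size)
    lowpass = OnePoleLowpass(coeff)
    out = []
    for x in signal:
        delayed = line.read()
        out.append(delayed)
        # Filter the delayed sample before feeding it back into the line.
        line.write(x + feedback * lowpass.process(delayed))
        line.advance()
    return out


echoes = run([1, 0, 0, 0, 0, 0, 0, 0], size=4, feedback=0.5, coeff=1.0)
print(echoes)  # [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.5, 0.0]
```

With the read pointer starting one slot ahead of the write pointer, a buffer of `size` slots gives a delay of `size - 1` samples, which is why the echoes land three samples apart here. Each `run` call builds its own `DelayLine`, so several copies can run side by side.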

I was able to build that off of the example in the first post (code not shown), but it won’t be usable if I can’t get the RAM under control.

I was almost there with extensions to the FLOSS version, but cannot figure out how to process the delay line’s feedback (for example, applying butterlp to an input that comes from an array element; it seems to want an a-rate variable).

I have another version that uses the delayr and delayw opcodes, but found it doesn’t give me enough control over the read/write pointers.

Frustrating!

Sorry for the delay in getting back to you. Have you tried a vanilla version outside of Cabbage? Also, the IDE adds some overhead, but this is stripped back when running the instrument as a plugin. Have you tried running it in a host? Reaper has a performance meter which also shows RAM usage. I suspect things will be better there.

Also, have you tried with a function table instead of an array?

I set ksmps in my main file to 48 using the code from my first post. RAM usage went from 770 MB (ksmps of 32) to 1.1 GB. At ksmps=64 it becomes 1.43 GB. However, that does not happen when I change ksmps in my other implementations. Hmm.

I’m pretty sure something is incorrect with that implementation of the circular buffer, but I’m not seeing it. That code is taken straight from the 2016 “Csound” book. The most obvious difference between the “Csound” version and “Floss” version is the use of an a-rate array vs. a k-rate array. I guess it could be something about how a-rate arrays are handled.

I haven’t moved it to a plugin yet. I did try in Cabbage on my PC with the same memory-usage result. I’ll try it in Logic, though I really don’t think the IDE is the issue.

I’ll try a function table.

Here’s the function table version:

<Cabbage>
form caption("Untitled") size(400, 300), guiMode("queue") pluginId("def1")
rslider bounds(296, 162, 100, 100), channel("gain"), range(0, 1, 0, 1, .01), text("Gain"), trackerColour("lime"), outlineColour(0, 0, 0, 50), textColour("black")

</Cabbage>
<CsoundSynthesizer>
<CsOptions>
-n -d
</CsOptions>
<CsInstruments>
; Initialize the global variables. 
ksmps = 32
nchnls = 2
0dbfs = 1


opcode Delay, a, ai
    setksmps 1
    asig, idel xin
    apos init 0
    isize = idel > 1/sr ? round(idel * sr) : 1

    ; Create a function table to store the delay line
    ift = ftgen(0, 0, isize, 2, 0) ; Create a table of size 'isize' filled with zeros

    ; Read the delayed sample
    adelayed tablei apos, ift

    ; Write the current sample into the table
    tablew asig, apos, ift

    ; Update the position counter
    apos = (apos + 1)%isize

    xout adelayed
endop




instr 1
kGain cabbageGetValue "gain"

a1 inch 1

a_delay_out1 Delay a1, 60

outs (a1 + a_delay_out1) * kGain, (a1 + a_delay_out1) * kGain
endin

</CsInstruments>
<CsScore>
;causes Csound to run for about 7000 years...
f0 z
;starts instrument 1 and runs it for a week
i1 0 [60*60*24*7] 
</CsScore>
</CsoundSynthesizer>

Performance seems comparable to the Floss version, so that’s really promising!

I don’t get why a table and a one-dimensional array have such different overhead, but that seems to be the case. Thank you for the suggestion!

I moved the ftable version to my full project. With four 60-second loops, I’m seeing 90 MB of RAM use. That seems totally reasonable. Awesome!

I just tried your UDO version with the arrays and I don’t see any crazy memory usage. In fact, it comes in at a little over 50 MB :thinking: Oh, but perhaps you’re using a higher sampling rate than I am.

Did you use the version in the first post with setksmps 1 (opcode Delay)? The first version is the one that uses tons of RAM.

The “Delay” version uses over 700 MB of RAM on both my Mac and my Windows machine. If you’re getting good performance, I’d love to understand what is going on.

I think I’m at 24 bit and 48 kHz, though I don’t know what happens to the bit depth internally.

I’m used to seeing this in the Csound logger:
sr = 48000.0, kr = 1500.000, ksmps = 32

That’s what I see when running the memory-hogging version of the plugin.

I think I’m close to understanding what’s going on. There’s a comment in the FLOSS manual explaining that the index of an a-rate array selects an audio signal, not a sample:

An audio array is a collection of audio signals. The size (length) of the audio array denotes the number of audio signals which are hold in it. In the next example, the audio array is created for two audio signals:

aArr[] init 2

I think Csound is basically prepped to update every element of my array (60*sr long!) at every sample, even though I’m only updating one element in the entire array.

Long story short: I think ftables are absolutely the way to go here.

isize = idel > 1/sr ? round(idel*sr) : 1
adelay[] init isize

Yes, that’s it. This will create isize audio vectors. Ouch :rofl: That’s not the first time I’ve failed to spot this, it happened in one of my own instruments recently :man_facepalming:

p.s. Just to confirm, the first one does indeed screw with my RAM :slight_smile:
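As a rough sanity check on the "isize audio vectors" explanation, the arithmetic can be sketched in Python. Two things here are assumptions rather than confirmed facts: that each sample takes 8 bytes (a double-precision build), and that each vector is as long as the global ksmps rather than the UDO's setksmps 1. The second assumption is only suggested by the way RAM use grew with ksmps earlier in the thread.

```python
# Estimated size of `adelay[] init isize` if every element is a full audio vector.
SR = 48_000            # sample rate reported in the Csound log
DELAY_SECONDS = 60     # idel passed to the UDO
BYTES_PER_SAMPLE = 8   # assumption: 64-bit float samples


def audio_array_bytes(ksmps, seconds=DELAY_SECONDS, sr=SR):
    """isize vectors, each holding ksmps samples (assumed: global ksmps)."""
    isize = round(seconds * sr)
    return isize * ksmps * BYTES_PER_SAMPLE


def krate_array_bytes(seconds=DELAY_SECONDS, sr=SR):
    """isize single values, as in the k-rate array version."""
    return round(seconds * sr) * BYTES_PER_SAMPLE


for ksmps in (32, 48, 64):
    print(f"ksmps={ksmps}: {audio_array_bytes(ksmps) / 1e6:.2f} MB")
print(f"k-rate array: {krate_array_bytes() / 1e6:.2f} MB")
# ksmps=32: 737.28 MB
# ksmps=48: 1105.92 MB
# ksmps=64: 1474.56 MB
# k-rate array: 23.04 MB
```

These estimates are in the same ballpark as the figures reported above (roughly 770 MB, 1.1 GB, and 1.43 GB), though not an exact match, so treat this as a plausibility check rather than a measurement.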

Great! Thank you!

Kind of crazy that this is the circular buffer delay example in the 2016 book, but I’m really glad to have figured it out.

It’s clearly a typo :man_shrugging: And most people would never notice an issue with it. Nice spot.

I realized that the ftable approach was leading to a “click” at the end of the table. With a buffer of 60 seconds, it wasn’t a big deal, but I was testing with shorter buffers and it was going to be an issue.

I ended up switching back to the FLOSS version as a UDO. Knowing what I know now, I was able to keep the array at k-rate but set setksmps to 1, which effectively makes it a-rate (one k-cycle per sample). And since k-indexed arrays work the way I expected, there are no crazy RAM issues.

What a journey. :slight_smile:

Some closure on this: Apparently this is actually a bug in Csound 6. Switching to k-rate fixes it, but a-rate shouldn’t have had the issue in the UDO. Hopefully it’ll be fixed in Csound 7.

Obviously not a crippling bug for others since it has been around for so long.
