The BSE Engine
==============

The following gives an outline of the conceptual approach used to simulate
flow-system behavior, along with details of the implementation.

Introductory remarks:
---------------------
The BSE Engine simulates signal flow systems by mapping them onto a network
of signal processing modules which are connected via signal streams. The
signal streams are value quantized and time discrete; floats are used to
store the quantized values, and the samples are taken at arbitrary but
equidistant points in time. The sampling periods are also assumed to be
synchronous for all nodes.
In the public BSE C API, engine modules are exposed as BseModule structures;
in the internal engine implementation, each BseModule is embedded in an
EngineNode structure.

Node Model:
-----------
* a node has n_istreams input streams
* a node has n_jstreams joint input stream facilities, that is, an unlimited
  number of output streams can be connected to each "joint" input stream
* a node has n_ostreams output streams
* all streams are equally value quantized as IEEE 754 floats, usually within
  the range -1..1
* all streams are synchronously time discrete
* the flow-system behavior can be iteratively approximated by calculating a
  node's output streams from its input streams
* since all streams are equally time discrete, n output values for all output
  streams can be calculated from n input values at all input streams of a
  single network
* some nodes always react delayed ("deferred" nodes) and can guarantee that
  they can always produce n output values ahead of receiving the
  corresponding n input values, with n >= 1
* a node that has no output facilities (n_ostreams == 0), or that has none of
  its output streams connected and is flagged by the user as a
  "consumer node", must be processed

Node Methods:
-------------
->process()
    This method specifies through one of its arguments the number of
    iterations the node has to perform, and therefore the number of values
    that are supplied in its stream input buffers and which have to be
    supplied in its stream output buffers.
->process_deferred()
    This method specifies the number of input values supplied and the
    number of output values that should be supplied. The number of input
    values may be smaller than the number of output values requested, in
    which case the node may return fewer output values than requested.
->reset()
    The purpose of this method is to reset the local state kept in a node
    to the initial state it possessed before the first call to process()
    or process_deferred().
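The following sketch illustrates the node model and the role of process();
it uses deliberately simplified, hypothetical structures (Node, gain_process)
rather than the actual BseModule API, and the buffer layout is an assumption
made for this example only.

  /* Simplified node-model sketch; not the actual BseModule API. */
  typedef struct {
    unsigned int  n_istreams;   /* number of input streams */
    unsigned int  n_ostreams;   /* number of output streams */
    const float **ibuffers;     /* one value buffer per input stream */
    float       **obuffers;     /* one value buffer per output stream */
    void         *user_data;    /* per-node state, here a single gain factor */
  } Node;

  /* process(): compute n_values output values from n_values input values */
  static void
  gain_process (Node *node, unsigned int n_values)
  {
    const float *in   = node->ibuffers[0];
    float       *out  = node->obuffers[0];
    float        gain = *(const float *) node->user_data;
    unsigned int i;
    for (i = 0; i < n_values; i++)
      out[i] = in[i] * gain;    /* values are quantized as floats, usually -1..1 */
  }

A deferred node's process_deferred() would differ only in that it may be
handed fewer input values than the number of output values it is asked to
produce.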
Node Relationships:
-------------------
Node B is an "input" of node A if:
* one of A's input streams is connected to one of B's output streams, or
* node C is an "input" of A and B is an "input" of C

Processing Order:
-----------------
If node A has an input node B and A is not a deferred node, B has to be
processed prior to processing A.

Connection Cycles:
------------------
Nodes A and B "form a cycle" if A is an input of B and B is an input of A.

Invalid Connections:
--------------------
For nodes A and B (not necessarily distinct) which form a cycle, the
connections that the cycle consists of are only valid if there exists a
node C for which the following holds:
(C is a deferred node) and
(C == A or C == B or
 (if C were completely disconnected, the nodes A and B would no longer
  form the cycle))

Implementation Notes
====================
* if a node is deferred, all output channels are delayed
* independent leaf nodes (nodes that have no inputs) can be scheduled
  separately
* nodes contained in a cycle have to be scheduled together

Scheduling Algorithm
--------------------
To schedule a consumer node and its dependency nodes, schedule_query() it:

Query and Schedule Node:
* tag the current node
* ignore already scheduled input nodes
* schedule_query_node on untagged input nodes, then do one of:
  - schedule the input node (if it has no cycles)
  - resolve all of the input node's cycles and then schedule the input
    node's cycle (if self is not in the cycle)
  - take over the cycle dependencies from the input node
* a tagged input node is added to the precondition list (opens a new cycle)
* the node's own leaf level is the MAX() of its input nodes' leaf levels + 1
* untag the node

Resolving Cycles:
* at each scheduling stage, eliminate the immediate child from the
  precondition list; once the list is empty, the cycle is resolved
* at least one of the eliminated nodes has to be deferred for the cycle to
  be valid

Scheduling:
* nodes need to be processed in the order of their leaf level (a sketch of
  the leaf-level computation follows below)
* within a leaf level, the processing order is determined by a per-node
  processing cost hint (cheap, normal, expensive)
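To make the leaf-level rule concrete, here is a minimal sketch of how leaf
levels could be derived recursively; the Node layout is an assumption made
for illustration only, and the tagging, cycle resolution and cost-hint
ordering described above are omitted.

  /* Leaf-level computation sketch (cycles and tagging omitted). */
  typedef struct Node Node;
  struct Node {
    unsigned int  n_inputs;
    Node        **inputs;      /* nodes connected to this node's input streams */
    unsigned int  leaf_level;  /* 0 for independent leaf nodes */
  };

  static unsigned int
  query_leaf_level (Node *node)
  {
    unsigned int level = 0, i;
    for (i = 0; i < node->n_inputs; i++)
      {
        /* own leaf level is MAX() of the input nodes' leaf levels + 1 */
        unsigned int l = query_leaf_level (node->inputs[i]) + 1;
        if (l > level)
          level = l;
      }
    node->leaf_level = level;
    return level;
  }

Independent leaf nodes end up at leaf level 0 and are processed first;
processing then proceeds in order of increasing leaf level.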
Suspending and sample accurate activation
-----------------------------------------
In music synthesis practice, large branches of a flow graph are often
unused; e.g. a certain branch might make up a logical voice which is muted
for long periods of time. To save processing power, nodes can be suspended
and resumed at arbitrary times (not necessarily block boundaries). Being
suspended causes a node's processing method to be skipped and its output
buffers to be filled with zeros.
Ordinary connection/disconnection of nodes is not sufficient to fulfill
this purpose, because connection changes can only happen at block
boundaries, while sample accurate timing is required (for voice
activation), and nodes which hold internal state usually need to be reset
(by means of calling their reset() method) upon resumption.
Suspending a node automatically suspends its input nodes, unless they have
outputs connected to other nodes which are not suspended.
Due to the structural nature of connections within branches (connections
may form cycles or rhomboids of nodes), two propagation passes are required
to propagate the activation (suspend) time through a branch. In order to
determine the activation time of a node for which an output node is known
to have been suspended (this output node need not be directly connected),
all directly connected output nodes need to be examined and must already
contain updated activation time stamps. It is thus mandatory to maintain
information about the validity of a node's activation time stamp, which
results in the requirement for a second recursion pass. During the first
pass, all possibly affected nodes (these are all inputs of the node whose
activation time changed) are flagged as requiring an update of their
activation time. During the second pass, a node's activation time is
determined from the activation times of the nodes connected to its outputs
(which may first need updating themselves).
Various optimizations are possible during the two recursion passes, e.g.
allowing suspension only at block boundaries, or integrating one recursion
pass with other recursive tasks on the graph (scheduling). Nodes should not
be activated or suspended due to a cyclic connection to one of their
outputs, so cyclic connection paths must be ignored when determining the
activation time of a node.
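The two passes might be sketched roughly as follows; the Node layout, the
needs_update/in_progress flags and the rule that the earliest output
activation time wins are assumptions made for this example, not the
engine's actual data structures or policy.

  /* Two-pass activation-time propagation sketch. */
  #include <stdint.h>

  typedef struct Node Node;
  struct Node {
    unsigned int  n_inputs, n_outputs;
    Node        **inputs;          /* nodes feeding this node */
    Node        **outputs;         /* nodes fed by this node */
    uint64_t      activation_time; /* sample stamp at which the node becomes active */
    int           needs_update;    /* set by pass 1, cleared by pass 2 */
    int           in_progress;     /* guards against cyclic connection paths */
  };

  /* Pass 1: flag all (direct and indirect) inputs of the changed node. */
  static void
  flag_inputs (Node *node)
  {
    unsigned int i;
    for (i = 0; i < node->n_inputs; i++)
      if (!node->inputs[i]->needs_update)
        {
          node->inputs[i]->needs_update = 1;
          flag_inputs (node->inputs[i]);
        }
  }

  /* Pass 2: recompute a node's activation time from its output nodes,
   * updating those first if necessary and ignoring cyclic paths. */
  static uint64_t
  update_activation (Node *node)
  {
    if (node->in_progress)          /* cyclic connection path: ignore */
      return UINT64_MAX;
    if (node->needs_update)
      {
        uint64_t t = UINT64_MAX;
        unsigned int i;
        node->in_progress = 1;
        for (i = 0; i < node->n_outputs; i++)
          {
            uint64_t ot = update_activation (node->outputs[i]);
            if (ot < t)             /* earliest output activation wins (assumed) */
              t = ot;
          }
        node->in_progress = 0;
        node->activation_time = t;
        node->needs_update = 0;
      }
    return node->activation_time;
  }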
Virtual modules
---------------
To ease the implementation of object networks which use the engine for the
actual calculations, virtual modules are supported. Virtual modules are not
scheduled and thus don't consume processor cycles during the calculation
phase. Their input/output streams are mere reconnection points for streams
of real processing modules. As virtual modules are not part of the active
calculation schedule, flow and suspend jobs may not be queued on them, but
input recursive suspend/activation propagation (and thus indirect
resumption) propagates correctly across virtual modules (though not
separately along their individual input/output streams). To let activation
time propagate per stream, multiple virtual modules can be used.

Deferred Node Implementation Considerations:
--------------------------------------------
For deferred nodes, the number n specifying the amount of output values
that are produced ahead of input can be considered mostly-fixed. That is,
it is unlikely to change often and will do so only at block boundaries.
Supporting a completely variable n or considering it mostly fixed has
certain implications. Here are the considerations that led to supporting a
completely variable n in the implementation:

n is block-boundary fixed:
+ for complex cycles (i.e. cycles that contain other cycles, "subcycles"),
  the subcycles can be scheduled separately if the n of the subcycle is
  >= block_size
- if n is the only thing that changed at a block boundary, rescheduling the
  flow graph is required in the cases where n = old_n + x with
  old_n < block_size, or if x < 0
- deferred nodes can not change their delay in response to the values of an
  input stream

n is variable for every iteration step:
+ no rescheduling is required if n changes at a block boundary
- subcycles can not be scheduled separately from their outermost cycle
+ the delay of deferred nodes can correlate to an input stream

Threads, communication, main loops
==================================
Thread types:
* UserThread; for the scope of the engine (the functions exposed in
  bseengine.h), only one user thread may execute API functions at a time.
  I.e. if more than one user thread needs to call engine API functions, the
  user has to take measures to avoid concurrent calls to these functions,
  e.g. by using an SfiMutex which is locked around engine API calls.
* MasterThread; the engine, if configured accordingly, sets up one master
  thread which
  - processes transactions from the UserThread
  - schedules the processing order of engine modules
  - processes single modules when required
  - processes module cycles when required
  - passes processed transactions and flow jobs back to the UserThread for
    garbage collection.
* SlaveThread; the engine can be configured to spawn slave threads which,
  in addition to the master thread,
  - process single modules when required
  - process module cycles when required.

Communication at thread boundaries:
* Job transaction queue; the UserThread constructs job transactions and
  enqueues them for the MasterThread. The UserThread also dequeues already
  processed transactions, so that destroy functions of modules and
  accessors are only executed within the UserThread. Also, the UserThread
  can wait (block) until all pending transactions have been processed by
  the MasterThread (in order to sync state with the module network
  contained in the engine).
* Flow job collection list; the MasterThread adds processed flow jobs to a
  collection queue; the UserThread then collects the queued flow jobs and
  frees them.
* Module/cycle pool; the MasterThread fills the module/cycle pool with
  modules which need to be processed. The MasterThread and the SlaveThreads
  pop modules/cycles from this pool, process them, and push back processed
  nodes.
* load control; // FIXME

Main loop integration:
In order to process certain engine modules only from within the UserThread,
and to drive the engine even without master or slave threads, the engine
can be hooked up to a main loop mechanism supplied by the UserThread (a
sketch of such a driver loop is given below). The engine provides API entry
points to:
- export file descriptors and a timeout, suitable for main loop backends
  such as poll(2)
- check whether dispatching is necessary
- dispatch outstanding work to be performed by the engine
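The following sketch shows how a UserThread driver loop based on those
three entry points could look; the engine_prepare(), engine_check() and
engine_dispatch() names are hypothetical placeholders, not the actual
bseengine.h functions, and the fd/timeout export is simplified to a single
pollfd array.

  /* Hypothetical UserThread driver loop based on poll(2). */
  #include <poll.h>

  #define MAX_ENGINE_FDS 16

  /* placeholder prototypes for the three entry points described above */
  void engine_prepare  (struct pollfd *fds, unsigned int *n_fds, int *timeout_ms);
  int  engine_check    (const struct pollfd *fds, unsigned int n_fds);
  void engine_dispatch (void);

  static void
  drive_engine_once (void)
  {
    struct pollfd fds[MAX_ENGINE_FDS];
    unsigned int n_fds = MAX_ENGINE_FDS;
    int timeout_ms = -1;

    /* 1) export file descriptors and timeout */
    engine_prepare (fds, &n_fds, &timeout_ms);

    /* 2) wait for I/O or timeout */
    poll (fds, n_fds, timeout_ms);

    /* 3) check whether dispatching is necessary, then dispatch */
    if (engine_check (fds, n_fds))
      engine_dispatch ();
  }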
TODO:
=====
- self-input cycles need to be resolved in the parent as well
- needs a description of pollfd/callback jobs
- load control for slave threads

Nov 27 2004  Tim Janik
  * reflect renames in the code base

Sep 06 2003  Tim Janik
  * introduce time stamps indicating the suspend state of a node
    (activation time)
  * update TODO

Jul 15 2002  Tim Janik
  * describe virtual modules

Jun 30 2002  Tim Janik
  * describe reset()
  * describe suspension
  * fix consumer node definition
  * TODO cleanups

Jan 07 2002  Tim Janik
  * cosmetic updates, flow jobs

Aug 19 2001  Tim Janik
  * notes on threads, communication, main loops

Jul 29 2001  Tim Janik
  * wording/spelling fixups

May 05 2001  Tim Janik
  * initial writeup

LocalWords: BSE API BseModule EngineNode istreams ostreams A's B's sync
LocalWords: bseengine SfiMutex UserThread MasterThread SlaveThread SlaveThreads