This is a more work-related post. Nothing too specific, just a big info dump while I walk into university this morning.

Let's first talk about serial processing, which goes through its operations in a fixed order, waiting for one job to complete before moving on to the next. Higher clock speeds and higher-bandwidth memory are fairly obvious ways of increasing performance in this type of system.
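To make that concrete, here's a trivial sketch of a serial pipeline (the job names and timings are invented for the example): each job finishes before the next begins, so the only real levers on throughput are making each job run faster or feeding it data faster.

```python
# A trivial illustration of serial processing: each job runs to completion
# before the next one starts, so total time is the sum of the job times.
import time

def job(name, duration):
    time.sleep(duration)          # stand-in for real work
    print(f"{name} done")

def run_serial(jobs):
    for name, duration in jobs:   # strictly one after another
        job(name, duration)

if __name__ == "__main__":
    run_serial([("read sensors", 0.1), ("filter", 0.2), ("log", 0.1)])
```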

In more complicated systems, the amount of IO and the limitations of industry-grade processors make it impossible to plug everything into a single device and have it process all the data at a sufficiently high speed to provide the required functionality.

So we've found that most of these systems are made up of many separate subsystems running independently of one another. This has several advantages: if designed properly, each subsystem can be debugged and developed in isolation, which is of significant value to software engineers who don't want to have to know about every aspect of the vehicle's software in order to develop for the platform.

However, there are significant disadvantages to this approach. In many situations these isolated systems need to be able to communicate with one another because they are part of the same product. Another factor is a cost-saving one: most manufacturers don't like the idea of having multiple sensors of the same type plugged into separate systems for similar purposes, so sensor data ends up being shared, increasing the total amount of communication these originally isolated subsystems need to perform.

One approach to solving this problem is to make the sensors themselves the communicating elements and to move all the computation to a centralised location. This has several advantages and disadvantages, many of which are cost factors. Most industries, for similar reasons to those mentioned above, like sourcing the cheapest solution to the problem provided it meets the operating criteria. Making something like a thermocouple a networked sensor would, at the simplest level, require the thermocouple itself, an ADC, a microcontroller and a network transceiver. Admittedly the software on these sensors is fairly minimal, and the only thing you might consider adding is some form of identification mechanism so the network itself can auto-configure around the sensors plugged into it.
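As a rough sketch of how minimal that sensor software could be (the node ID, message format, broadcast port and the read_adc() stub are all invented for illustration, not a real protocol):

```python
# Minimal sketch of a networked thermocouple node: read the ADC, broadcast
# the value tagged with an ID so the network can auto-configure around it.
import json
import socket
import time

NODE_ID = "thermo-01"            # identification mechanism for auto-configuration
SENSOR_TYPE = "thermocouple"
BROADCAST = ("255.255.255.255", 50000)  # hypothetical discovery/data port

def read_adc() -> float:
    """Stub for the ADC read; a real node would talk to hardware here."""
    return 23.7

def main():
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    while True:
        msg = {"id": NODE_ID, "type": SENSOR_TYPE, "value": read_adc()}
        sock.sendto(json.dumps(msg).encode(), BROADCAST)
        time.sleep(0.1)          # 10 Hz report rate, picked arbitrarily

if __name__ == "__main__":
    main()
```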

Moving the computational nodes to a central location would at first seem to create the same issues found when trying to use a single system to do everything. However, the proximity gives us another way around that problem, involving what I'll call functional dependency IPC reduction. Independent of the system, be it multi-threaded shared memory (think Unified Memory Addressing (UMA) in the PX2) or the multi-discrete-system approach often found in computer clusters, it is advantageous to dedicate the majority of the computational resources to the task in question rather than having the system wait on shared resources (locks and mutexes). While the mechanism for reducing these blocking operations will differ between the types of system, the solution is best served by the nodes being located, in a network sense at least, within a short response time of one another. The fastest interconnect is usually considered to be system memory, and as such it would typically be considered optimal to unify all processing into a single system with one unified memory address space. What difficulties will this approach face when paired with a real-time system? Well, unfortunately modern processors aren't typically optimised at a hardware level for the interprocess communication required to manage access to that memory, so additional computational overhead exists for every additional bit of shared memory. This isn't so bad, you say... While a hypervisor could try to minimise conflicts for memory by reducing the number of threads running conflicting functions, this adds an additional task to the system: keeping track of the resource allocation. A multi-system approach could allow for a partial solution to this... We don't know for certain.
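Here's a toy illustration of the lock-avoidance idea in a shared-memory setting (the workload, the doubling "computation" and the partitioning scheme are all made up): instead of every worker fighting over one shared structure, each worker gets its own partition and results are merged once at the end.

```python
# Reducing blocking on shared resources by partitioning the work.
import threading

def worker_shared(out, lock, samples):
    # Contended baseline, shown only for contrast (not run in the demo):
    # every update has to acquire the shared lock.
    for s in samples:
        with lock:
            out.append(s * 2)

def worker_partitioned(local_out, samples):
    # Contention-free version: each worker owns its own output list.
    for s in samples:
        local_out.append(s * 2)

def run_partitioned(all_samples, n_workers=4):
    chunks = [all_samples[i::n_workers] for i in range(n_workers)]
    locals_ = [[] for _ in range(n_workers)]
    threads = [threading.Thread(target=worker_partitioned, args=(locals_[i], chunks[i]))
               for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # One merge step at the end replaces a lock acquisition per sample.
    return [x for local in locals_ for x in local]

if __name__ == "__main__":
    print(len(run_partitioned(list(range(10_000)))))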

If a multi-node compute system were to be devised, one that contains a dedicated hypervisor and slave nodes that execute the commands it issues, the optimisation work could be allocated to the dedicated hypervisor. This again runs into the same problem of adding compute complexity, but in this case the load won't impact the slaves.
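A toy sketch of that split, with threads standing in for the hypervisor and the slave nodes (the queue-based dispatch and task format are my own invention, not a concrete design): all the allocation bookkeeping stays on the hypervisor, while the slaves only pull and execute commands.

```python
# Hypervisor/slave split: the coordinator does the scheduling, workers just execute.
import queue
import threading

def slave(commands):
    # Slave nodes do no scheduling; they execute whatever they are handed.
    while True:
        cmd = commands.get()
        if cmd is None:              # shutdown signal from the hypervisor
            break
        func, args = cmd
        func(*args)

def hypervisor(tasks, worker_queues):
    # All the allocation/tracking cost lives here, off the compute nodes.
    for i, task in enumerate(tasks):
        worker_queues[i % len(worker_queues)].put(task)
    for q in worker_queues:
        q.put(None)

if __name__ == "__main__":
    queues = [queue.Queue() for _ in range(3)]
    workers = [threading.Thread(target=slave, args=(q,)) for q in queues]
    for w in workers:
        w.start()
    hypervisor([(print, (f"task {n}",)) for n in range(9)], queues)
    for w in workers:
        w.join()
```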

This was just me blathering on about some ideas. Nothing too concrete yet.