diff --git a/README.html b/README.html
index a0b8e65..5743961 100644
--- a/README.html
+++ b/README.html
@@ -9,8 +9,7 @@
-NOTE: I have been studying a lot about threading for the past few months and have some awesome additions in store! They will take a while to come out, though. The goal of the library is still to provide a simple and efficient way to multitask in lua.
In Changes you’ll find documentation for (In Order):
My multitasking library for lua. It is a pure lua binding, if you ignore the integrations and the love2d compat. If you find any bugs or have any issues, please let me know :). If you don't see a table of contents, try using the ReadMe.html file. It is easier to navigate than the markdown readme.
INSTALLING
Note: The latest version of Lua Lanes is required if you want to make use of system threads on lua 5.1+. I will update the dependencies for LuaRocks since this library should work fine on lua 5.1+. You also need the lua-net library and the bin library; all are installed automatically using luarocks. However, you can set things up manually if lanes and luasocket are installed. Links:
https://github.com/rayaman/bin
https://github.com/rayaman/multi
https://github.com/rayaman/net
To install copy the multi folder into your environment and you are good to go
If you want to use the system threads, then you’ll need to install lanes!
or use luarocks
luarocks install multi
Note: Soon you may be able to run multitasking code on multiple machines (network parallelism). This, however, will have to wait until I hammer out some bugs within the core of system threading itself.
See the rambling section to get an idea of how this will work.
Discord
For real-time assistance with my libraries! A place where you can ask questions and get help, and you can request features there as well.
https://discord.gg/U8UspuA
Upcoming Plans: Adding network support for threading, kind of like your own lua cloud. This will require the bin, net, and multi libraries. Once that happens I will include those libraries as a set. This also means that you can expect both standalone and bundled versions of the libraries.
Planned features/TODO
- Add system threads for love2d that work like the lanesManager (loveManager, slight differences).
- Improve performance of the library.
- Improve coroutine-based threading scheduling.
- Improve love2d idle thread CPU usage / fix the performance when using system threads in love2d… Tricky; look at the rambling section for insight.
- Add more control to coroutine-based threading.
- Add more control to system-based threading.
- Make practical examples that show how you can solve real problems.
- Add more features to support module creators.
- Make a framework for easier thread task distributing.
- Fix error handling on threaded multi objects. Non-threaded multi objects will crash your program if they error! Use multi:newThread() or multi:newSystemThread() if your code can error, unless you use multi:protect(); this however lowers performance!
- Add multi:OnError(function(obj,err)).
- sThread.wrap(obj): may or may not be completed. Theory: allows interaction in one thread to affect it in another. The addition of threaded tables may make this possible!
- SystemThreaded Actors: after some tests I figured out a way to make this work… It will work slightly differently though, due to the actor needing to be splittable…
- Load balancing for system threads (once SystemThreaded Actors are done).
- Add more integrations.
- Fix SystemThreadedTables.
- Finish the wiki. (11% done)
- Test for unknown bugs.
Known Bugs/Issues
Regarding integrations, thread cancellation works slightly differently for love2d and lanes. Within love2d I was unable to (too lazy to…) avoid using the multi library within the thread. A fix for this is to call multi:Stop() when you are done with your threaded code! This may change if I find a way to work around it. In love2d, to mimic the GLOBAL table, I needed the library to constantly sync the data… You can use the sThread.waitFor(varname) or sThread.hold(func) methods to sync the global data and get the value instead of using GLOBAL. If you want to go this route I suggest setting multi.isRunning=true to prevent the auto runner from doing its thing! This will make the multi manager no longer function, but that's the point :P THREAD.kill() should do the trick from within the thread. A listener could be made to detect when a thread kill has been requested and send it to the running thread.
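The workaround above might be sketched roughly like this. The integration module path and exact call sites are assumptions drawn from the text, not a verified API:

```lua
-- Hypothetical sketch of the love2d thread-cancellation workaround.
local multi = require("multi")
require("multi.integration.loveManager") -- assumed integration module path

multi:newSystemThread("worker", function()
    -- Option 1: do the threaded work, then stop the in-thread multi manager.
    -- ... threaded code ...
    multi:Stop() -- as suggested above, call this when the thread is done

    -- Option 2 (alternative): disable the auto runner and sync global data
    -- manually instead of relying on GLOBAL:
    -- multi.isRunning = true
    -- local value = sThread.waitFor("someVar")
end)
```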
Another bug concerns the SystemThreadedJobQueue: only one can be used for now. This is going to change in a future update.
And SystemThreadedTables only support one table between the main and worker thread! They do not work when shared between 2 or more threads. If you need that much flexibility, use the GLOBAL table that all threads have. FIXED
For module creators using this library: I suggest using SystemThreadedQueues instead of SystemThreadedTables for rapid data transfer. If you plan on having constants that will always be the same, then a table is a good idea! They support up to n threads and can be messed with and abused as much as you want :D FIXED Use what you want!
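As a minimal sketch of the queue-based approach (the constructor name, the integration module path, and the push call are all assumptions based on the surrounding text):

```lua
-- Sketch only: constructor and method names are assumptions, not verified API.
local multi = require("multi")
require("multi.integration.lanesManager") -- assumed integration module path

-- Queues for rapidly changing data; reserve SystemThreadedTables for
-- constants that never change after setup.
local queue = multi:newSystemThreadedQueue("dataQueue") -- name is hypothetical
queue:push("fast-changing data") -- worker threads pop this on their side
```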
Love2D SystemThreadedTables do not send love2d userdata, use queues instead for that! FIXED
Usage:
Note: the node will keep a log of all the commands that it gets, in a file called "NodeName.log". You can set the limit by lines or file size. Also, you can set it to clear the log at an interval if no error exists. All errors are both logged and sent to the host as well. You can have more than one host and more than one node (duh :P).
The goal of the node is to provide a simple and easy way to run commands on a remote machine.
There are 2 main ways you can use this feature. 1. One node per machine with system threads being able to use the full processing power of the machine. 2. Multiple nodes on one machine where each node is acting like its own thread. And of course, a mix of the two is indeed possible.
Love2d: sleeping reduces the CPU time, making my load detection think the system is under more load, thus preventing it from sleeping… I will investigate other means. As of right now it will not eat all your CPU if threads are active. For now, I suggest killing threads that aren't needed anymore. On lanes, threads at idle use 0% CPU and it is amazing. A state machine may solve what I need though: one idle state that sleeps and only switches to the active state if a job request or data is sent to it; after some time of not being under load it switches back into the idle state… We'll see what happens.
Love2d doesn't like to send functions through channels; by default, it does not support this. I achieve it by dumping the function and loadstring-ing it on the thread. This, however, is slow. For the SystemThreadedJobQueue, I had to change my original idea of sending functions as jobs. The way you do it now is to register a job function once and then call that job across the thread through a queue. Each worker thread pops from the queue, runs the job, and returns the result. The job ID is automatically updated and allows you to keep track of the order that the data comes in. A table with numeric indexes can be used to organize the data…
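The register-once pattern described above might look something like this. The method names (registerFunction, pushJob, OnJobCompleted) and the integration path are assumptions, not a confirmed API:

```lua
-- Sketch of the register-once job pattern (method names are assumptions).
local multi = require("multi")
require("multi.integration.lanesManager") -- assumed integration module path

local jq = multi:newSystemThreadedJobQueue()
-- Register the function once so it is never re-sent through a channel.
jq:registerFunction("square", function(n) return n * n end)
jq:pushJob("square", 7) -- workers pop only the job name + args from a queue
jq.OnJobCompleted(function(jobID, result)
    -- the auto-incremented job ID lets you reorder results into a table
    print(jobID, result)
end)
multi:mainloop()
```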
Regarding benchmarking: if you see my benchmarks and are wondering why they are 10x better, it's because I am using luajit for my tests. I highly recommend using luajit with my library, but lua 5.1 will work just as well, only not as fast.
So, while working on the jobQueue:doToAll() method, I figured out why love2d's threaded tables were acting up when more than one thread was sharing the table. It turns out one thread was eating all the pops from the queue and starving all the others… I'll need to use the same trick I did with GLOBAL to fix the problem… However, at the rate I am going, threading in love will become way slower. I might use the regular GLOBAL to manage data internally for threaded tables…
It has been a while since I had to bring out the Multi Functions… Syncing within threads is a pain! I had no idea what a task it would be to get something as simple as syncing data working… I will probably add a SystemThreadedSyncer in the future because it will make life easier for you as well. SystemThreadedTables are still not going to work on love2d, but will work fine on lanes… I have a solution and it is being worked on… Fixed this :D. Depending on when I push the next update to this library, the second half of this ramble won't apply anymore.
I have been using this (EventManager —> MultiManager —> now multi) for my own purposes and started making this when I first started learning lua. You can see how the code changed and evolved throughout the years. I tried to include all the versions that still existed on my HDD.
I added my old versions to this library… It started out as the EventManager and was kind of crappy, but it was the start of this library. It kept getting better and better until it became what it is today. There are some features that no longer exist in the latest version, but they were removed because they were useless… I added these files to the GitHub so those interested can see into my mind, in a sense, and see how I developed the library before I used GitHub.
The first version of the EventManager was function based, not object based, and benched at about 2000 steps per second… Yeah, that was bad… I used loadstring and it was a mess… Look and see how it grew throughout the years; I think it may interest some of you!
diff --git a/README.md b/README.md
index 3e6c4d5..c99600e 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,4 @@
-# multi Version: 12.1.0 Fixing bugs and making the library eaiser to use
+# multi Version: 12.2.0 Added better priority management, function chaining, and some bug fixes
My multitasking library for lua. It is a pure lua binding, if you ignore the integrations and the love2d compat. If you find any bugs or have any issues, please let me know. **If you don't see a table of contents try using the ReadMe.html file. It is easier to navigate than the markdown readme**
diff --git a/changes.html b/changes.html
index 235c3f4..45240a2 100644
--- a/changes.html
+++ b/changes.html
@@ -12,184 +12,303 @@
Note: After doing some testing, I have noticed that using multi-objects is quite a bit faster than using (coroutine) multi:newThread(). Only create a thread if there is no other possibility! System threads are different and will improve performance if you know what you are doing. Using a (coroutine) thread as a loop with a timer is slower than using a TLoop! If you do not need the holding features I strongly recommend that you use the multi-objects. This could be due to the scheduler that I am using, and I am looking into improving the performance of the scheduler for (coroutine) threads. This is still a work in progress, so expect things to only get better as time passes! This was the reason threadloop was added. It binds the thread scheduler into the mainloop, allowing threads to run much faster than before. Also, the use of locals is now possible since I am not dealing with separate objects. And finally, reduced function overhead helps keep the threads running better.
Added:
- All methods that did not return anything before now return the object itself, thus allowing chaining. Most, if not all, mutators returned nil, so chaining can now be done. I will eventually write up full documentation of everything, which will show this.
multi = require("multi")
multi:newStep(1,100):OnStep(function(self,i)
    print("Index: "..i)
end):OnEnd(function(self)
    print("Step is done!")
end)
multi:mainloop{
    priority = 3
}
Priority 3 works a bit differently than the other 2. P1 follows a formula that resembles this: ~n=I*PRank, where n is the amount of steps given to an object with priority rank PRank and where I is the idle time; see the chart below. The aim of this priority scheme was to make core objects run fastest while letting idle processes get decent time as well.
C: 3322269 ~I*7
H: 2847660 ~I*6
A: 2373050 ~I*5
N: 1898440 ~I*4
B: 1423830 ~I*3
L: 949220 ~I*2
I: 474610 ~I
~n=I*PRank
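Plugging the idle value I = 474610 into n = I * PRank reproduces the chart above. The priority names for each rank are my own reading of the single-letter rows (C/H/A/N/B/L/I), and the computed values drift slightly from the measured ones:

```lua
-- Reconstruct the approximate P1 chart from n = I * PRank.
local I = 474610 -- idle-row value from the chart above
local ranks = {
    {"Core", 7}, {"High", 6}, {"Above", 5}, {"Normal", 4},
    {"Below", 3}, {"Low", 2}, {"Idle", 1},
}
for _, r in ipairs(ranks) do
    -- e.g. Core -> 474610 * 7 = 3322270, vs 3322269 measured
    print(r[1], I * r[2])
end
```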
P2 follows a formula that resembles this: ~n=n*4, where n is the idle time; see the chart below. The goal of this one was to make core processes higher while keeping idle processes low.
C: 6700821
H: 1675205
A: 418801
N: 104700
B: 26175
L: 6543
I: 1635
~n=n*4
P3 ignores using a basic formula and instead bases its processing time on the amount of CPU time there is. If CPU time is low and a process is set at a lower priority, it will get its time reduced. There is no formula; at idle, almost all processes work at the same speed!
C: 2120906
H: 2120906
A: 2120906
N: 2120906
B: 2120906
L: 2120906
I: 2120506
Auto priority works by seeing what should be set high or low. Due to lua not having more precision than milliseconds, I was unable to build a detailed manager that can set things to high, above normal, normal, etc. This has either high or low. If a process takes longer than .001 milliseconds it will be set to low priority. You can change this by using the setting autolowest = multi.Priority[PLevel]; the default is low, not idle, since idle tends to get about 1 process each second, though you can change it to idle using that setting.
Improved:
I usually give an example of the changes made, but this time I have an explanation for multi.nextStep(). It's not an entirely new feature, since multi:newJob() does something similar, though it works completely differently. nextStep adds a function that is executed first on the next step. If multiple things are added to nextStep, then they will be executed in the order that they were added.
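A minimal sketch of the ordering behavior, assuming the dotted call form used above:

```lua
-- Functions queued with multi.nextStep run first on the next step,
-- in the order they were added (per the description above).
local multi = require("multi")
multi.nextStep(function() print("first") end)
multi.nextStep(function() print("second") end)
multi:mainloop()
```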
Note:
The upper limit of this library's performance on my machine is ~39 mil. This is simply a while loop counting up from 0 that stops after 1 second. The ~20 mil that I am currently getting is probably as fast as it can get, since it's half of the max performance possible, and I have noticed that each layer doubles complexity. Throughout the years with this library I have seen massive improvements in speed. In the beginning we had only ~2000 steps per second. Fast, right? Then after some tweaks we went to about 300000 steps per second, then 600000. Some more tweaks brought me to ~1 mil steps per second, then to ~4 mil, then ~9 mil, and now finally ~20 mil… The doubling effect that I have been seeing means that odds are I have reached the limit. I will aim to add more features and optimize individual objects. If it's possible to make the library even faster, then I will go for it.
Fixed:
Changed:
package.path="?/init.lua;?.lua;"..package.path
multi = require("multi")
local a = 0
multi:newThread("test",function()
    print("lets go")
    b,c = thread.hold(function() -- This now returns what was managed here
        return b,"We did it!"
    end)
    print(b,c)
end)
multi:newTLoop(function()
    a=a+1
    if a == 5 then
        b = "Hello"
    end
end,1)
multi:mainloop()
Note: only if the first return is non-nil/false will any other returns be passed! So while variable b above is nil, the string "We did it!" will not be passed. Also, while this seems simple enough to get working, I had to modify a bit of how the scheduler worked to add such a simple feature. Quite a bit is going on behind the scenes, which made this a bit tricky to implement, but not hard; it just needed a bit of tinkering. Plus, event objects had not been edited since the creation of the EventManager. They have remained mostly the same since 2011.
I continue to make small changes as I come across them. This change was inspired while working on the net library. I was adding simple binary file support over tcp, and needed to pass the data from the socket when the requested amount had been received. While upvalues did work, I felt returning data was cleaner, so I added this feature.
Note: After doing some testing, I have noticed that using multi-objects is quite a bit faster than using (coroutine) multi:newThread(). Only create a thread if there is no other possibility! System threads are different and will improve performance if you know what you are doing. Using a (coroutine) thread as a loop with a timer is slower than using a TLoop! If you do not need the holding features I strongly recommend that you use the multi-objects. This could be due to the scheduler that I am using, and I am looking into improving the performance of the scheduler for (coroutine) threads. This is still a work in progress, so expect things to only get better as time passes! This was the reason threadloop was added. It binds the thread scheduler into the mainloop, allowing threads to run much faster than before. Also, the use of locals is now possible since I am not dealing with separate objects. And finally, reduced function overhead helps keep the threads running better.
nGLOBAL = require("multi.integration.networkManager").init()
node = multi:newNode(tbl: settings)
master = multi:newMaster(tbl: settings)
multi:nodeManager(port)
thread.isThread() -- for coroutine based threads
Changed:
Note on queues: when it comes to network queues, they only send one way. What I mean by that is that if the master sends a message to a node, its own queue will not get populated at all. The reason is that syncing which consumer popped what from which network queue would make things really slow and would not perform well at all. This means you have to code a bit differently. Use master:getFreeNode() to get the name of the node under the least amount of load, then handle the sending of data to each node that way.
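The load-balancing pattern above might look like this. Only getFreeNode comes from the text; the newMaster settings table and the way work is pushed to a node are assumptions:

```lua
-- Sketch only: settings fields and the dispatch step are hypothetical.
local multi = require("multi")
local nGLOBAL = require("multi.integration.networkManager").init()

local master = multi:newMaster{ name = "master1" } -- settings table assumed
local target = master:getFreeNode() -- node under the least load (from text)
-- Dispatch work to `target` through that node's own queue; because network
-- queues are one-way, the master's queue never sees this message.
multi:mainloop()
```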
Now there is a little trick you can do. If you combine both the networkManager and the systemThreading manager, then you could have a proxy queue for all system threads that can pull from that "node". Data passing within a LAN (and a WAN if using the node manager, though p2p isn't working as I would like and you would need to open ports to make things work; remember you can define a port for your node, so you can port forward that if you want) is fast enough, but the waiting problem is something to consider. Ask yourself what you are coding and whether network parallelism is worth using.
Note: These examples assume that you have already connected the nodes to the node manager. You do not need to use the node manager, but sometimes broadcast does not work as expected and the master does not connect to the nodes. Using the node manager offers nice features like removing nodes from the master when they have disconnected, and automatically telling the master when nodes have been added. A more complete example showing connections regardless of order is in the example folder; check it out. New naming scheme too.
NodeManager.lua
Going forward:
- I am really excited to finally get this update out there, but I left one important thing out: enabling of environments for each master connected to a node. This would allow a node to isolate code from multiple masters so they cannot interact with each other. This will come out in version 12.1.0, but might take a while due to the job hunt that I am currently going through.
- Another feature that I am on the fence about is adding channels. They would work like queues, but are named, so you can separate the data into different channels where only part of the system can see certain data.
- I also might add a feature that allows different system threads to consume from a network queue if they are spawned on the same physical machine. This is possible at the moment; it just doesn't have a dedicated object for handling it seamlessly. You can do this yourself though.
- Another feature that I am thinking of adding is crosstalk, a setting that would allow nodes to talk to other nodes. I did not add it in this release since there are some issues that need to be worked out and it's very messy atm. However, since nodes are named, I may allow pushing data to another node by default, but not have the global table sync, since this is where the issue lies.
- Improve Performance
- Fix supporting libraries (Bin, and net need tons of work)
- Look for the bugs
- Figure out what I can do to make this library more awesome