Future Updates and Coming Soon

As my final quarter at SCAD drew to a close I was able to finish v0.3; however, due to time constraints I was unable to build a package in time to post to the web. This should be a short-term problem, and the update will be made available soon.

Along with this update there are two other development announcements for PyFarm. First and most importantly, I am announcing the start of development on PyFarmCE, or Corporate Edition. This edition will be an add-on available to companies, workgroups, and advanced users who wish to use PyFarm in a centralized environment. More information about its features and specifications will follow. Lastly, future updates on PyFarm and PyFarmCE should be followed through the RSS feeds, roadmaps, and wiki pages. This blog will be updated only for major releases or upon completion of new research.

Thanks and happy rendering!

Tracking Changes with Revision Notes

Revision notes are a useful way of tracking changes between revisions. In the past, in order to view these notes you had to either visit PyFarm’s Launchpad website and dig down to find them or check out the working branch itself. The latest revisions page provides a great deal of information about development including new bugs, bug fixes, feature implementation, and much more. For your convenience these reports also include a link to the original Launchpad tree, the date committed, and the revision number. Click the link below to view the past ten revisions.

About -> Latest Revisions

Job Logging and UDP

Socket based communication falls into one of two primary categories, TCP or UDP. TCP, or Transmission Control Protocol, is used for many things including loading HTML pages, transferring files over the internet, and downloading email. UDP, or User Datagram Protocol, is used when the constant flow of data is more important than each individual packet. Examples include voice and video streaming, instrument monitoring, and online gaming. Overall UDP sockets are easier to use and implement but less reliable (see Table A for complete details).

When it comes to performance, however, UDP has a clear advantage because it is connectionless. This simply means that no connection is set up before data is sent, and the server does not reply to the client to acknowledge each packet. Skipping the handshake and the acknowledgements leads to lower latency and less processing per message. For something like process logging, where the processes cannot be stopped or paused, UDP is a good fit because the sender does not wait for a connection to start, transmit, and properly terminate. The trade-off is that a lost log packet is not resent. In the upcoming release of v0.3 the job logging is performed over UDP sockets. You can read more about TCP and UDP sockets on Wikipedia.

Table A.
TCP vs. UDP Packets

TCP                                            | UDP
-----------------------------------------------+-------------------------------------------
End to end connection (connection oriented)    | No end to end connection (connectionless)
Error checking with retransmission             | Checksum only, no retransmission
Stream oriented                                | Packet oriented
Guaranteed delivery                            | Delivery not guaranteed
Bidirectional                                  | Bidirectional
Moderate performance                           | High performance
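
As a rough illustration of the connectionless behaviour described above, here is a minimal sketch of sending one log line over UDP with Python's standard socket module. This is not PyFarm's logging code: the send_log_line helper, the message text, and the use of an ephemeral loopback port are all assumptions made for the example, and it is written for Python 3.

```python
import socket

def send_log_line(sock, address, line):
    """Send one log line as a single UDP datagram (hypothetical helper)."""
    # sendto() needs no prior connect() and waits for no acknowledgement
    sock.sendto(line.encode("utf-8"), address)

# Receiver: bind to an ephemeral port on the loopback interface
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)  # do not block forever if the datagram is lost
address = receiver.getsockname()

# Sender: a plain datagram socket, no handshake with the receiver
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send_log_line(sender, address, "frame 1: render started")

# Each recvfrom() call returns exactly one datagram
data, _ = receiver.recvfrom(4096)
received = data.decode("utf-8")
print(received)

sender.close()
receiver.close()
```

Note that the sender never learns whether the line arrived. That is acceptable for log output, but it would not be for commands that must reach a render node.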

Release Candidate 3: Complete

The third and final release candidate is now complete and available for download. As this is a release candidate, please be sure you read the associated wiki page before proceeding with your download. Also included with this release is a small readme file with some installation instructions.

Displaying Large Sets of Data With Model-View Programming

Figure A

Figure B

One of the major problems when working with large sets of data is trying to display the information in a timely manner. As an example, a user requests a 300 frame render on a Maya file that contains four separate render passes, resulting in 1,200 frames to process. In addition to those 1,200 frames there are at least eleven individual attributes that keep track of each frame’s information and status, bringing the total number of data points up to 13,200. To generate, query, assign, and filter this data into an interface before displaying it to the user takes at least ten seconds each time the information is requested. This is far too long to wait each time the interface refreshes, so a better solution to this flow of data (as seen in Figure A) had to be found.

After some extra research I came across the idea of Model/View programming, an architecture used to define the relationship and interaction of data with the interface. In the previous design the interface had to wait for all of the information to be assigned before rendering the interface itself; by breaking this down into smaller steps, massive sets of data can be displayed almost instantly. For example, upon opening the job details interface only about eighteen frames need to be displayed, so instead of requesting 13,200 points of data only 198 (18 frames with 11 attributes each) need to be determined and then mapped to the interface. This is where the new model comes into play, and it has many advantages over the previous design. When the user requests more frames by scrolling, the data is requested by the delegate, which asks the model (which itself holds the prepopulated data), which in turn informs the main table of the new frames.

While this development has delayed the release until at least Monday, further delays are not predicted at this time.
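
The arithmetic above can be sketched without any GUI toolkit. The following is a simplified stand-in for the model/view idea, not the Qt classes PyFarm uses: the FrameModel class, the visible_rows helper, and the placeholder attribute names are all hypothetical, and the code assumes Python 3. It shows that only the rows in view have their attributes built.

```python
ATTRIBUTES_PER_FRAME = 11  # attribute count quoted in the post

class FrameModel:
    """Knows the job's dimensions and builds a row only when asked for it."""

    def __init__(self, frame_count, pass_count):
        self.pass_count = pass_count
        self.row_count = frame_count * pass_count
        self.lookups = 0  # number of attribute values actually produced

    def row(self, index):
        """Build the attributes for one row on demand."""
        if not 0 <= index < self.row_count:
            raise IndexError(index)
        self.lookups += ATTRIBUTES_PER_FRAME
        attributes = {
            "frame": index // self.pass_count + 1,
            "pass": index % self.pass_count + 1,
        }
        # Placeholders standing in for status, host, and the other attributes
        for n in range(ATTRIBUTES_PER_FRAME - len(attributes)):
            attributes["attr%d" % n] = None
        return attributes

def visible_rows(model, first_row, visible_count):
    """Return only the rows currently scrolled into view."""
    last_row = min(first_row + visible_count, model.row_count)
    return [model.row(i) for i in range(first_row, last_row)]

model = FrameModel(frame_count=300, pass_count=4)
rows = visible_rows(model, first_row=0, visible_count=18)

print(model.row_count * ATTRIBUTES_PER_FRAME)  # data points in the whole job
print(model.lookups)                           # data points built for the view
```

Scrolling corresponds to calling visible_rows again with a different first_row, so the cost of a refresh depends on the size of the window rather than the size of the job.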

Job Details Preview

Status Report: RC3 On Time

Several things have changed in the past week, most of which mark a major shift towards the final product. To start off, the 1300+ lines of code in the main program have been moved into external libraries, resulting in much cleaner and easier to read code. I have also been working on methods to deploy PyFarm to systems other than Linux (the code was already cross-platform; deployment has been the issue). As a result of this research an installer now exists for Mac OS X, which will be released alongside the third and final release candidate. Lastly, almost all status related events have been moved into an event based framework, further reducing the amount of code required to inform the master of a client’s status. Stay tuned, release candidate three is due out in about a week!

Remote Host Information

After a couple of days working to convert the old data management system into the new dictionary based system, I have been able to provide the user with host information. This allows the user to see specific info and stats about the selected node including IP, hostname, status, OS, architecture, frames rendered, frames failed, frame failure rate, and software installed. Future versions of this widget may allow the user to disable rendering for specific software packages on a per node basis.

Remote host information retrieved from TCP socket

Python Mathematics - Calculating Mode

For PyFarm, mathematics is key to informing the user about the status of their current render. Conditional statements can query the individual values, but they cannot produce an overall answer without summarizing the data. Enter statistics, a branch of mathematics dedicated to the collection, analysis, interpretation, and presentation of data. Using statistics PyFarm is able to inform the user of frame status, job status, shortest render time, longest render time, average render time, and much more. The function below demonstrates the calculation of the mode; see the comments for an explanation.

# the function name is illustrative
def calcMode(numList):
    # create an empty dictionary to hold the data
    frequency = {}

    # for each number (x) in the input list (numList)
    for x in numList:
        # if the number is already in the dictionary (frequency)
        # increment its count, else start counting
        if x in frequency:
            frequency[x] += 1
        else:
            frequency[x] = 1

    # an empty list has no mode
    if not frequency:
        return None

    # find the highest count in the dictionary
    highest = max(frequency.values())

    # if no value occurs more than once there is no mode
    if highest == 1:
        return None

    # collect every value that occurs 'highest' times
    modes = [x for x in frequency if frequency[x] == highest]

    # Several values can tie for the mode. Dictionaries are not sorted, so
    # pick the lowest key explicitly; it is the 'safest' status to return.
    # Otherwise we might return failed as the current status even though
    # there are just as many rendering frames.
    return min(modes)
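
For comparison, the counting loop above can also be written with collections.Counter from the standard library. This is only a sketch under assumptions: the render_summary helper, the numeric status codes (1 for rendering, 2 for failed, 3 for complete), and the sample render times are invented for illustration. Unlike the listing above, it does not treat a list with no repeated value as having no mode. It is written for Python 3.

```python
from collections import Counter

def render_summary(statuses, render_times):
    """Summarize frame statuses and render times (hypothetical helper)."""
    counts = Counter(statuses)
    highest = max(counts.values())
    # On a tie, prefer the lowest status code, as in the listing above
    mode = min(status for status, count in counts.items() if count == highest)
    return {
        "mode": mode,
        "shortest": min(render_times),
        "longest": max(render_times),
        "average": sum(render_times) / len(render_times),
    }

# Assumed status codes: 1 = rendering, 2 = failed, 3 = complete
summary = render_summary(
    statuses=[1, 2, 1, 3, 2],
    render_times=[12.0, 30.0, 18.0],  # seconds, made-up values
)
print(summary)
```

Here rendering and failed both occur twice, so the tie is resolved in favour of the lower code.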

Job Management and Intermodule Communication

Previously job management was handled by a simple Python based queue module. To add a frame for rendering, a command would be ‘put’ into the queue, and get() would be run when a new command was required for rendering. However, running get() removes the command from the queue, so it is no longer available as an object to PyFarm. For certain items, such as the current state of a frame, information must be retained for the duration of the program’s execution; without that information PyFarm would simply pass commands along without informing the user of their progress. By replacing the queue with a nested dictionary, large amounts of data can be stored, modified, and queried quickly and easily with very little code. The new system also makes it possible to save the current queue to formats such as XML (which also observes the parent -> child relationship), allowing the user to back up the current job information for later use. See the example job dictionary below.
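
The difference between the two approaches can be sketched in a few lines. This is an illustration rather than PyFarm's code: it uses Python 3's standard queue module (named Queue in Python 2), and the "Waiting" status and variable names are assumptions.

```python
import queue  # this module is named Queue in Python 2

command = "render -r mr -v 5 -s 1 -e 1 scene.mb"

# Queue approach: get() hands the command over and the queue forgets it
pending = queue.Queue()
pending.put(command)
next_command = pending.get()
queue_is_empty = pending.empty()  # nothing is left to query for status

# Dictionary approach: the frame entry stays and is updated in place
frames = {1: {"status": "Waiting", "command": command}}
frames[1]["status"] = "Rendering"
frames[1]["host"] = "render01"
frame_still_known = 1 in frames

print(queue_is_empty, frame_still_known, frames[1]["status"])
```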

Example Job Dictionary:

jobs = { "Job1" : {
                    "status" : "Rendering",
                    "frames" : {
                                    "complete" : 1,
                                    "total" : 3,
                                    "failed" : 1
                                    },
                    "subjobs" : {
                                    "sub1" : {
                                                "status" : "Rendering",
                                                "complete" : 1,
                                                "total" : 3,
                                                "failed" : 1,
                                                "frames" : {
                                                                1 : {"status" : "Failed", "host" : "render01", "pid" : 9978, "command" : "render -r mr -v 5 -s 1 -e 1 scene.mb",
                                                                      "stdout" : "This is the first line in the log\nThis is the second"},

                                                                2 : {"status" : "Rendering", "host" : "render02", "pid" : 8243, "command" : "render -r mr -v 5 -s 2 -e 2 scene.mb",
                                                                "stdout" : "This is the first line in the log\nThis is the second"},

                                                                3 : {"status" : "Complete", "host" : "render03", "pid" : 10124, "command" : "render -r mr -v 5 -s 3 -e 3 scene.mb",
                                                                "stdout" : "This is the first line in the log\nThis is the second"},
                                                                }
                                                }
                                    }
                   }
        }

frames = jobs["Job1"]["subjobs"]["sub1"]["frames"]

# sort the keys so the frames always print in order
for frame in sorted(frames):
    thisFrame = frames[frame]
    # the parenthesized form works in both Python 2 and Python 3
    print("Frame %i:\n\tStatus: %s\n\tHost: %s\n\tPID: %i\n\tCommand: %s\n\tLog: %s" %
          (frame, thisFrame["status"], thisFrame["host"], thisFrame["pid"], thisFrame["command"], thisFrame["stdout"].split('\n')))

Output:

Frame 1:
	Status: Failed
	Host: render01
	PID: 9978
	Command: render -r mr -v 5 -s 1 -e 1 scene.mb
	Log: ['This is the first line in the log', 'This is the second']
Frame 2:
	Status: Rendering
	Host: render02
	PID: 8243
	Command: render -r mr -v 5 -s 2 -e 2 scene.mb
	Log: ['This is the first line in the log', 'This is the second']
Frame 3:
	Status: Complete
	Host: render03
	PID: 10124
	Command: render -r mr -v 5 -s 3 -e 3 scene.mb
	Log: ['This is the first line in the log', 'This is the second']
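
Since the nested dictionary already has a parent -> child shape, saving it to XML is mostly a matter of walking it. The sketch below uses xml.etree.ElementTree from the standard library on a trimmed copy of the example dictionary. The jobs_to_xml function and the element and attribute names (jobs, job, subjob, frame) are assumptions for illustration, not PyFarm's actual backup format, and the code is written for Python 3.

```python
import xml.etree.ElementTree as ET

def jobs_to_xml(jobs):
    """Convert a nested job dictionary into an XML tree (hypothetical layout)."""
    root = ET.Element("jobs")
    for job_name, job in jobs.items():
        job_el = ET.SubElement(root, "job", name=job_name, status=job["status"])
        for sub_name, sub in job["subjobs"].items():
            sub_el = ET.SubElement(job_el, "subjob", name=sub_name,
                                   status=sub["status"])
            for number in sorted(sub["frames"]):
                frame = sub["frames"][number]
                # XML attribute values must be strings
                frame_el = ET.SubElement(sub_el, "frame", number=str(number),
                                         status=frame["status"],
                                         host=frame["host"],
                                         pid=str(frame["pid"]))
                ET.SubElement(frame_el, "command").text = frame["command"]
                ET.SubElement(frame_el, "stdout").text = frame["stdout"]
    return root

# Trimmed copy of the example job dictionary
jobs = {"Job1": {
    "status": "Rendering",
    "subjobs": {"sub1": {
        "status": "Rendering",
        "frames": {
            1: {"status": "Failed", "host": "render01", "pid": 9978,
                "command": "render -r mr -v 5 -s 1 -e 1 scene.mb",
                "stdout": "This is the first line in the log\nThis is the second"},
            2: {"status": "Rendering", "host": "render02", "pid": 8243,
                "command": "render -r mr -v 5 -s 2 -e 2 scene.mb",
                "stdout": "This is the first line in the log\nThis is the second"},
        },
    }},
}}

xml_text = ET.tostring(jobs_to_xml(jobs), encoding="unicode")
print(xml_text)
```

Reading the backup later would be the reverse walk: parse the text with ET.fromstring and rebuild the dictionary from the job, subjob, and frame elements.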

XML Settings

XML is a markup language specification used to help organize and present information in a standardized format that can be read across multiple programs and platforms. Because of its wide array of uses and ease of access from within Python, I have opted to use it as the new format for PyFarm’s settings. Previously I was using a text document with :: and , delimiters, and while it was easy to understand in its own right, it required over twice the processing time to separate the settings out for use inside of the program. In the future PyFarm might use XML as the standard format for storing a backup queue file, stay tuned.

New PyFarm Settings File

<PyFarm>
    <settings>
        <network>
            <!-- General Network Settings -->
            <general type="host" value="127.0.0.1" />
            <general type="unit16" value="16" />

            <!-- Server Specific Settings -->
            <server type="admin" port="9630" />
            <server type="broadcast" port="9631" interval="15"/>
            <server type="que" port="9632" />
            <server type="stderr" port="9633" />
            <server type="stdout" port="9634" />
        </network>
    </settings>
    <software>
        <!-- General Software Settings -->
        <maya widgetIndex="0" fileGrep="Maya Scene File (*.mb *.ma)"/>
        <houdini widgetIndex="1" fileGrep="Houdini File (*.hip)"/>
        <shake widgetIndex="2" fileGrep="Shake Script (*.shk)"/>
        <!-- Maya Search Declarations -->
        <search os="linux" software="maya">
            <path>/usr/autodesk</path>
        </search>
        <search os="mac" software="maya">
            <path>/Applications/Autodesk</path>
        </search>
        <search os="win" software="maya">
            <path>C:\Program Files\Autodesk</path>
        </search>
        <!-- Houdini Search Declarations -->
        <search os="linux" software="houdini">
            <path>/opt</path>
            <path>/usr</path>
        </search>
        <search os="mac" software="houdini">
        </search>
        <search os="win" software="houdini">
            <path>C:\Program Files\Side Effects Software</path>
            <path>C:\Program Files (x86)\Side Effects Software</path>
        </search>
        <!-- Shake Search Declarations -->
        <search os="linux" software="shake">
            <path>/usr/apple/shake-v4.00.0607</path>
            <path>/opt/shake-v4.00.0607</path>
            <path>/usr/local/shake-v4.00.0607</path>
        </search>
        <search os="mac" software="shake">
            <path>/Applications/Shake/shake.app/Contents/MacOS</path>
        </search>
    </software>
</PyFarm>
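
To give an idea of how little code is needed to read these settings, here is a minimal sketch using xml.etree.ElementTree from the standard library on an abbreviated copy of the file above. It is not PyFarm's settings loader: the search_paths helper and the ports dictionary are hypothetical names, and the code is written for Python 3.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the settings file shown above
SETTINGS = """
<PyFarm>
    <settings>
        <network>
            <general type="host" value="127.0.0.1" />
            <server type="admin" port="9630" />
            <server type="broadcast" port="9631" interval="15"/>
        </network>
    </settings>
    <software>
        <search os="linux" software="houdini">
            <path>/opt</path>
            <path>/usr</path>
        </search>
        <search os="mac" software="houdini">
        </search>
    </software>
</PyFarm>
"""

root = ET.fromstring(SETTINGS)

# Map each server type to its port number
ports = {server.get("type"): int(server.get("port"))
         for server in root.iter("server")}

def search_paths(root, os_name, software):
    """Return the search paths declared for one OS and software package."""
    paths = []
    for search in root.iter("search"):
        if search.get("os") == os_name and search.get("software") == software:
            paths.extend(path.text for path in search.findall("path"))
    return paths

print(ports)
print(search_paths(root, "linux", "houdini"))
print(search_paths(root, "mac", "houdini"))  # empty: no paths declared
```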