Wednesday, November 18, 2009

Knot Pop Quiz

THE QUESTION
what knot is this? any idea?

THE ANSWER
Bowline.  In this form it's sometimes called a 'double bowline'.

In the old days this knot was used to tie the rope around your waist, with no harness. If the leader fell and the rope didn't break, the result was not pretty: broken ribs were common. When harnesses first came around, the bowline, and later the double bowline, was the standard tie-in knot. Some people still use it to tie in. It's supposed to be easier to untie after being loaded than a figure eight, but it has sharper bends in it, which means it's slightly weaker than the figure eight. Any bend in a rope lowers its breaking strength, and the sharper the bend, the more the rope is weakened. The figure eight has the least sharp bends of any known knot, making it the 'strongest'. It is also easy to tie and easy to verify at a glance that it is tied correctly, which are the main reasons it is preferred by most climbers today.

Note: This post was ghost written by me for "could have been my first guest blogger but refused" Neal Harder in response to http://twitter.com/sudarkoff/status/5836400206. aka I just copied his email response into this form since he's "allergic" to blogging. 

Wednesday, October 28, 2009

Twisted Reactor w/XML-RPC, POST, GET

While rewriting our Berkeley XMLDB services on Twisted, I decided to implement some CouchDB-style interfaces, such as responding to POST and GET requests for DB actions. Oh, and don't forget the XML-RPC interfaces already in use, which have to stay for backwards compatibility! It was surprisingly simple after staring at the docs for a day or two, so I'm posting a piece of the code in the hope that the example makes things easier for people new to Twisted. Below you'll find an example which not only covers the POST/GET/XML-RPC thing, but also has thread pools, deferreds, XML-RPC kwargs, and a reactor example. There are some pieces missing, since the server is actually much more complicated, but hopefully you get the idea. Bon appétit!
#! /usr/bin/python
import os
import atexit
import signal
import sys

from xxx.services.utilities import write_pid, exit_function, handle_sigterm

from twisted.web import xmlrpc, server, resource
from twisted.internet import defer, threads
from twisted.python import threadpool
from twisted.internet import reactor

'''
Configure logging
The stdlib logging module blocks, which violates twisted principles,
so please use it only for warnings/errors that should be
seen by everyone. DB queries et al. should go to the twisted logging
module, since theirs is non-blocking.
http://twistedmatrix.com/trac/wiki/TwistedLogging
'''
from twisted.python import log

'''
We are using two different versions of twisted between machines -
so let's see if we can accommodate both easily
'''
from twisted import version as tversion
# Compare (major, minor) so that e.g. 9.0 also counts as "above 8.1"
TWISTED_ABOVE_8_1 = (tversion.major, tversion.minor) >= (8, 2)
class TXmldb(xmlrpc.XMLRPC):
    isLeaf = True

    def __init__(self):
        xmlrpc.XMLRPC.__init__(self)
        # Set up a thread pool.
        # NOTE: `config` is a project module that is not shown in this post.
        self.threadpool = threadpool.ThreadPool(config.MIN_THREADS,
                                                config.MAX_THREADS)
        self.threadpool.start()

    # support for handling rest requests as well
    def render_GET(self, request):
        func = request.path[1:]  # omit leading slash
        if func == 'get':
            howMuchHotness = request.args['howMuchHotness'][0]
            defer.maybeDeferred(self.__get, howMuchHotness).addCallbacks(
                self.finishup,
                errback=self.error,
                callbackArgs=(request,),
                errbackArgs=(request,))
        else:
            return "No comprende senor!"
        return server.NOT_DONE_YET

    def render_POST(self, request):
        request.setHeader("Connection", "Keep-Alive")
        # if this is an xmlrpc call, there will be no args
        # since it will be all marshalled up into a content
        # body. So, checking for arg length should tell us
        # accurately if this is xmlrpc or not
        if not len(request.args):
            return xmlrpc.XMLRPC.render_POST(self, request)
        func = request.path[1:]  # omit leading slash
        # otherwise handle this like a post
        data = request.args['data'][0]
        if func == 'add':
            defer.maybeDeferred(self.__add, data).addCallbacks(
                self.finishup,
                errback=self.error,
                callbackArgs=(request,),
                errbackArgs=(request,))
        else:
            return "No comprende senor!"
        return server.NOT_DONE_YET

    def finishup(self, result, request):
        # post and get only take string results
        if result is True:
            result = "1"
        elif result is False or result is None:
            result = ""
        request.write(result)
        request.finish()

    def error(self, failure, request):
        # Not part of the original excerpt: a minimal errback so that a
        # failed call is logged and the request does not hang forever
        log.err(failure)
        request.setResponseCode(500)
        request.finish()

    '''
    ---------------
    XMLRPC INTERFACES
    Note: In the past, we called this with positional arguments
    but we also want support for passing a dictionary of keyword values.
    Theoretically the positional version gets phased out, but I
    can see value in having both. The 'k' suffix stands for "Kwargs".
    ---------------
    '''
    def xmlrpc_get(self, howMuchHotness):
        return self._deferToThread(self.__get, howMuchHotness)

    def xmlrpc_getk(self, kwargs):
        return self._deferToThread(self.__get, **kwargs)

    def __get(self, howMuchHotness):
        return "did something %s" % howMuchHotness

    def __add(self, data):
        # Placeholder: the real DB insert is one of the missing pieces
        return True

    def _deferToThread(self, f, *args, **kwargs):
        # f must be a callable, not a method name
        if TWISTED_ABOVE_8_1:
            return threads.deferToThreadPool(reactor, self.threadpool, f, *args, **kwargs)
        else:
            d = defer.Deferred()
            self.threadpool.callInThread(threads._putResultInDeferred, d, f, args, kwargs)
            return d

    def __del__(self):
        self.threadpool.stop()

def start(port, schema=None):
    port = int(port)
    log.msg("Initializing txmldb on port %s ..." % (port,))
    r = TXmldb()
    reactor.listenTCP(port, server.Site(r))
    reactor.run()  # blocks until the reactor is stopped
    return reactor

def deamonize(port=7080, logFile="/var/log/txmldb.log", pidFile="/var/tmp/txmldb.pid"):
    fp = open(logFile, 'a+b')
    log.startLogging(fp)
    try:
        pid = os.fork()
        if pid > 0:
            # Exit first parent
            sys.exit(0)
    except OSError, e:
        print >>sys.stderr, "fork #1 failed: %d (%s)" % (e.errno, e.strerror)
        sys.exit(1)
    # Decouple from parent environment
    os.chdir("/")
    os.setsid()
    os.umask(0)
    # Do second fork
    try:
        pid = os.fork()
        if pid > 0:
            # Exit from second parent, recording the daemon's pid first
            write_pid(pid, pidFile)
            sys.exit(0)
    except OSError, e:
        print >>sys.stderr, "fork #2 failed: %d (%s)" % (e.errno, e.strerror)
        sys.exit(1)
    atexit.register(exit_function, pidFile)
    # Start the daemon main loop
    start(port)

if __name__ == "__main__":
    import getopt
    args = sys.argv[1:]
    port = 7080
    # Same defaults as deamonize(); passing None through would make
    # open(logFile) fail when -l is not given
    logFile = "/var/log/txmldb.log"
    pidFile = "/var/tmp/txmldb.pid"
    try:
        opts, args = getopt.getopt(args, "p:l:P:d")
    except getopt.GetoptError, e:
        print "%s. options are -l (log file location) -P (pid file location) -d (detach) and -p (port)" % e
        sys.exit(2)
    detach = False
    for opt, value in opts:
        if opt == "-p":
            port = value
        elif opt == "-l":
            logFile = value
        elif opt == "-P":
            pidFile = value
        elif opt == "-d":
            detach = True
    if detach:
        deamonize(port, logFile, pidFile)
    else:
        start(port)
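To make the "no args means XML-RPC" check in render_POST a bit more concrete, here is a small client-side sketch of the request bodies the server has to tell apart. It is not part of the original server: it uses only the Python 3 standard library instead of Twisted, nothing is sent over the network, and the value "scorching", the stand-in get function and the helper call_with_kwargs are made up for illustration.

```python
import xmlrpc.client
from urllib.parse import urlencode, parse_qs

# 1. A positional XML-RPC call is marshalled into one XML document,
#    so it arrives as a request body without any form arguments.
positional_body = xmlrpc.client.dumps(("scorching",), methodname="get")

# 2. The "k" variant sends ONE positional parameter: a dict (an XML-RPC
#    struct) that the server side expands with **kwargs.
kwargs_body = xmlrpc.client.dumps(
    ({"howMuchHotness": "scorching"},), methodname="getk")

# 3. A plain form POST carries key=value pairs instead.
form_body = urlencode({"data": "hello world"})


def get(howMuchHotness):
    # Hypothetical stand-in for the server-side method
    return "did something %s" % howMuchHotness


def call_with_kwargs(func, body):
    """Unmarshal an XML-RPC body whose single parameter is a dict and
    expand that dict into keyword arguments."""
    params, method_name = xmlrpc.client.loads(body)
    return method_name, func(**params[0])


print(xmlrpc.client.loads(positional_body))
print(call_with_kwargs(get, kwargs_body))
print(parse_qs(form_body))
```

The kwargs variant costs nothing extra on the wire: XML-RPC has no notion of keyword arguments, so a struct passed as the only parameter is a common way to fake them.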

Thursday, January 8, 2009

Plone LDAP and a 450% speed increase in page rendering time

"Where I be workin' now we's goin through trubles, perfomance troubles solved by ma jigga... me?"...

ok - so I can't rap. big deal, neither can you. point is that we have been investigating the curmudgeonliness in our plone 2.5.3 custom archetypes-based product and came across this gem of a performance fart. our setup is weird, I confess, and I would be surprised if this actually applies to anyone else, but nonetheless, thar she is.

we have many different user base DNs in active directory that share the same group DNs (for scalability reasons), and those groups map to zope roles. so we make plenty o' calls to list group members. turns out this setup had something weird: our manager DN had permission to list the members of other portals' DNs but not to retrieve them. so if the users base DN of one portal instance was "OU=AWESOME,DC=WE_ARE" and another's was "OU=OK,DC=WE_ARE", they could share a groups DN of "OU=EDITORS_GROUP,DC=WE_ARE". The query to ldap for members of the editors group would then return every user it can list, even those it cannot retrieve, from both portals, since they share this grouping. Seems harmless enough, right?

WRONG

(that could not be dramatic enough).

so for each user that comes back from the group listing, there is a call to get that user. if that call fails (i.e. the permission check fails), the user is just omitted from the list. so if those two portals each have 50 users in them, then either portal makes 100 calls to get users, even though only 50 of them can succeed. oh, and each call takes about 1/10th of a second, so that is roughly 10 seconds of lookups, half of it wasted.

security.declareProtected(manage_users, 'getGroupedUsers')
def getGroupedUsers(self, groups=None):
    """ Return all those users that are in a group """
    all_dns = {}
    users = []
    member_attrs = list(Set(GROUP_MEMBER_MAP.values()))

    if groups is None:
        groups = self.getGroups()

    for group_id, group_dn in groups:
        group_details = self.getGroupDetails(group_id)
        for key, vals in group_details:
            if key in member_attrs or key == '':
                # If the key is an empty string then the groups are
                # stored inside the user folder itself.
                for dn in vals:
                    all_dns[dn] = 1

    for dn in all_dns.keys():
        # Only attempt to retrieve the user if their DN
        # matches the Users Base DN
+       if not dn.count(self.users_base):
+           user = None
+       else:
            try:
                user = self.getUserByDN(dn)
            except:
                user = None

        if user is not None:
            users.append(user.__of__(self))

    return tuple(users)
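
A rough sketch of what the added check buys, using made-up stand-ins rather than the real user folder API: lookup_user, grouped_users and the generated DN strings are hypothetical, and the 0.1 second cost is just the ballpark figure from the example above. It counts how many lookups are attempted with and without the base DN check.

```python
USERS_BASE = "OU=AWESOME,DC=WE_ARE"
SECONDS_PER_LOOKUP = 0.1  # ballpark per-call cost from the example

# 50 users under our portal's base DN, 50 under another portal's
group_member_dns = (
    ["CN=user%d,OU=AWESOME,DC=WE_ARE" % i for i in range(50)]
    + ["CN=user%d,OU=OK,DC=WE_ARE" % i for i in range(50)]
)


def lookup_user(dn, calls):
    """Hypothetical stand-in for getUserByDN: records the call and
    fails for users outside our base DN."""
    calls.append(dn)
    if USERS_BASE not in dn:
        raise LookupError(dn)
    return dn


def grouped_users(dns, filter_first):
    """Return (users, number_of_lookups) for the given member DNs."""
    calls, users = [], []
    for dn in dns:
        if filter_first and USERS_BASE not in dn:
            continue  # skip the lookup that is bound to fail
        try:
            users.append(lookup_user(dn, calls))
        except LookupError:
            pass  # same behaviour as the bare except: user is omitted
    return users, len(calls)


users_before, calls_before = grouped_users(group_member_dns, False)
users_after, calls_after = grouped_users(group_member_dns, True)

print("lookups without check:", calls_before,
      "approx seconds:", calls_before * SECONDS_PER_LOOKUP)
print("lookups with check:", calls_after,
      "approx seconds:", calls_after * SECONDS_PER_LOOKUP)
```

Both variants return the same users; the check only removes the lookups that were going to fail anyway. Like the patch, the sketch uses a plain substring match, which assumes the DNs come back from the directory with the same casing and spacing as the configured base DN.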