{"input": "Any experiences with Protocol Buffers? I was just looking through some information about Google's protocol buffers data interchange format.  Has anyone played around with the code or even created a project around it?\nI'm currently using XML in a Python project for structured content created by hand in a text editor, and I was wondering what the general opinion was on Protocol Buffers as a user-facing input format.  The speed and brevity benefits definitely seem to be there, but there are so many factors when it comes to actually generating and processing the data.", "output": "Protocol buffers are intended to optimize communications between machines. They are really not intended for human interaction. Also, the format is binary, so it could not replace XML in that use case. \nI would also recommend JSON as being the most compact text-based format."}
{"input": "104, 'Connection reset by peer' socket error, or When does closing a socket result in a RST rather than FIN? We're developing a Python web service and a client web site in parallel.  When we make an HTTP request from the client to the service, one call consistently raises a socket.error in socket.py, in read:\n(104, 'Connection reset by peer')\nWhen I listen in with wireshark, the \"good\" and \"bad\" responses look very similar:\n\nBecause of the size of the OAuth header, the request is split into two packets.  The service responds to both with ACK\nThe service sends the response, one packet per header (HTTP/1.0 200 OK, then the Date header, etc.).  The client responds to each with ACK.\n(Good request) the server sends a FIN, ACK.  The client responds with a FIN, ACK.  The server responds ACK.\n(Bad request) the server sends a RST, ACK, the client doesn't send a TCP response, the socket.error is raised on the client side.\n\nBoth the web service and the client are running on a Gentoo Linux x86-64 box running glibc-2.6.1.  We're using Python 2.5.2 inside the same virtual_env.\nThe client is a Django 1.0.2 app that is calling httplib2 0.4.0 to make requests.  We're signing requests with the OAuth signing algorithm, with the OAuth token always set to an empty string.\nThe service is running Werkzeug 0.3.1, which is using Python's wsgiref.simple_server.  I ran the WSGI app through wsgiref.validator with no issues.\nIt seems like this should be easy to debug, but when I trace through a good request on the service side, it looks just like the bad request, in the socket._socketobject.close() function, turning delegate methods into dummy methods.  When the send or sendto (can't remember which) method is switched off, the FIN or RST is sent, and the client starts processing.\n\"Connection reset by peer\" seems to place blame on the service, but I don't trust httplib2 either.  Can the client be at fault?\n** Further debugging - Looks like server on Linux **\nI have a MacBook, so I tried running the service on one and the client website on the other.  The Linux client calls the OS X server without the bug (FIN ACK).  The OS X client calls the Linux service with the bug (RST ACK, and a (54, 'Connection reset by peer')).  So, it looks like it's the service running on Linux.  Is it x86_64? A bad glibc? wsgiref? Still looking...\n** Further testing - wsgiref looks flaky **\nWe've gone to production with Apache and mod_wsgi, and the connection resets have gone away.  See my answer below, but my advice is to log the connection reset and retry.  This will let your server run OK in development mode, and solidly in production.\n", "output": "I've had this problem.  See The Python \"Connection Reset By Peer\" Problem.\nYou have (most likely) run afoul of small timing issues based on the Python Global Interpreter Lock.\nYou can (sometimes) correct this with a time.sleep(0.01) placed strategically.  \n\"Where?\" you ask.  Beats me.  The idea is to provide some better thread concurrency in and around the client requests.  Try putting it just before you make the request so that the GIL is reset and the Python interpreter can clear out any pending threads.\n"}
{"input": "svg diagrams using python I am looking for a library to generate svg diagrams in python (I fetch data from a sql database). I have found python-gd, but it has not much documentation and last update was in 2005 so I wonder if there are any other libraries that are good for this purpose.\nI am mostly thinking about simple line graphs, something like this:\n\n", "output": "As you're looking for simple line graphics, probably, CairoPlot will fit your needs as it can generate svg output files out of the box. Take a look at this.\n\nThis example image shows only a few of its capabilities. Using the trunk version available at launchpad you'll be able to add a legend box and add axis titles.\nBesides that, using the trunk version, it's possible to generate:\n\nDotLine charts (the ones I believe you need)\nScatter charts\nPie/Donut charts\nHorizontal/Vertical Bar charts\nGantt charts\n\n"}
{"input": "Fastest Way to Delete a Line from Large File in Python I am working with a very large (~11GB) text file on a Linux system.  I am running it through a program which is checking the file for errors.  Once an error is found, I need to either fix the line or remove the line entirely.  And then repeat...\nEventually once I'm comfortable with the process, I'll automate it entirely.  For now however, let's assume I'm running this by hand.\nWhat would be the fastest (in terms of execution time) way to remove a specific line from this large file?  I thought of doing it in Python...but would be open to other examples.  The line might be anywhere in the file.\nIf Python, assume the following interface:\ndef removeLine(filename, lineno):\nThanks,\n-aj\n", "output": "You can have two file objects for the same file at the same time (one for reading, one for writing):\ndef removeLine(filename, lineno):\n    fro = open(filename, \"rb\")\n\n    current_line = 0\n    while current_line < lineno:\n        fro.readline()\n        current_line += 1\n\n    seekpoint = fro.tell()\n    frw = open(filename, \"r+b\")\n    frw.seek(seekpoint, 0)\n\n    # read the line we want to discard\n    fro.readline()\n\n    # now move the rest of the lines in the file \n    # one line back \n    chars = fro.readline()\n    while chars:\n        frw.writelines(chars)\n        chars = fro.readline()\n\n    fro.close()\n    frw.truncate()\n    frw.close()\n\n"}
{"input": "How to create simple web site with python? How to create simple web site with python? \nI mean really simple, f.ex, you see text \"Hello World\", and there are button \"submit\", which (onClick) will show ajax box \"submit successful\". \nI want to start develop some stuff with Python, and I don't know where to start ;)\n", "output": "I was hoping more elaborate answers would be given to this question, since it's a sensitive subject. Python web developing is split across lots of frameworks, each with its ups and downs and every developer using a different one. This is quite unfortunate. What you should know:\n\nUse WSGI. Don't use anything else, WSGI is the latest standard in Python web developing;\nDon't develop directly on top of WSGI, unless you really have to (not even for a hello world app);\nUse a framework that best suits your needs:\n\nI played with Werkzeug (which is not really a platform, they call it a toolkit) because it's really simple yet powerful. It lets you work on the WSGI level (also helps you understand how WSGI works) while providing really useful features and helpers. On the Werkzeug website you will also find useful tutorials and things like that.\nProbably the most popular framework is Django. Never used it, but maybe there's a reason for why it is so popular.\n\n\nIn conclusion, use whatever is closest to your heart.\n"}
{"input": "Extract images from PDF without resampling, in python? How might one extract all images from a pdf document, at native resolution and format? (Meaning extract tiff as tiff, jpeg as jpeg, etc. and without resampling). Layout is unimportant, I don't care were the source image is located on the page.\nI'm using python 2.7 but can use 3.x if required.\n", "output": "Often in a PDF, the image is simply stored as-is.  For example, a PDF with a jpg inserted will have a range of bytes somewhere in the middle that when extracted is a valid jpg file.  You can use this to very simply extract byte ranges from the PDF.  I wrote about this some time ago, with sample code: Extracting JPGs from PDFs.\n"}
{"input": "Fastest Way to generate 1,000,000+ random numbers in python I am currently writing an app in python that needs to generate large amount of random numbers, FAST. Currently I have a scheme going that uses numpy to generate all of the numbers in a giant batch (about ~500,000 at a time). While this seems to be faster than python's implementation. I still need it to go faster. Any ideas? I'm open to writing it in C and embedding it in the program or doing w/e it takes.\nConstraints on the random numbers:\n\nA Set of 7 numbers that can all have different bounds:\n\n\neg: [0-X1, 0-X2, 0-X3, 0-X4, 0-X5, 0-X6, 0-X7]\nCurrently I am generating a list of 7 numbers with random values from [0-1) then multiplying by [X1..X7]\n\nA Set of 13 numbers that all add up to 1\n\n\nCurrently just generating 13 numbers then dividing by their sum\n\n\nAny ideas? Would pre calculating these numbers and storing them in a file make this faster?\nThanks!\n", "output": "You can speed things up a bit from what mtrw posted above just by doing what you initially described (generating a bunch of random numbers and multiplying and dividing accordingly)...  \nAlso, you probably already know this, but be sure to do the operations in-place (*=, /=, +=, etc) when working with large-ish numpy arrays.  It makes a huge difference in memory usage with large arrays, and will give a considerable speed increase, too.\nIn [53]: def rand_row_doubles(row_limits, num):\n   ....:     ncols = len(row_limits)\n   ....:     x = np.random.random((num, ncols))\n   ....:     x *= row_limits                  \n   ....:     return x                          \n   ....:                                       \nIn [59]: %timeit rand_row_doubles(np.arange(7) + 1, 1000000)\n10 loops, best of 3: 187 ms per loop\n\nAs compared to:\nIn [66]: %timeit ManyRandDoubles(np.arange(7) + 1, 1000000)\n1 loops, best of 3: 222 ms per loop\n\nIt's not a huge difference, but if you're really worried about speed, it's something.\nJust to show that it's correct:\nIn [68]: x.max(0)\nOut[68]:\narray([ 0.99999991,  1.99999971,  2.99999737,  3.99999569,  4.99999836,\n        5.99999114,  6.99999738])\n\nIn [69]: x.min(0)\nOut[69]:\narray([  4.02099599e-07,   4.41729377e-07,   4.33480302e-08,\n         7.43497138e-06,   1.28446819e-05,   4.27614385e-07,\n         1.34106753e-05])\n\nLikewise, for your \"rows sum to one\" part...\nIn [70]: def rand_rows_sum_to_one(nrows, ncols):\n   ....:     x = np.random.random((ncols, nrows))\n   ....:     y = x.sum(axis=0)\n   ....:     x /= y\n   ....:     return x.T\n   ....:\n\nIn [71]: %timeit rand_rows_sum_to_one(1000000, 13)\n1 loops, best of 3: 455 ms per loop\n\nIn [72]: x = rand_rows_sum_to_one(1000000, 13)\n\nIn [73]: x.sum(axis=1)\nOut[73]: array([ 1.,  1.,  1., ...,  1.,  1.,  1.])\n\nHonestly, even if you re-implement things in C, I'm not sure you'll be able to beat numpy by much on this one...  I could be very wrong, though!\n"}
{"input": "set axis limits in matplotlib pyplot I have two subplots in a figure. I want to set the axes of the second subplot such that it has the same limits as the first subplot (which changes depending on the values plotted). Can someone please help me? Here is the code:\nimport matplotlib.pyplot as plt\n\nplt.figure(1, figsize = (10, 20))\n## First subplot: Mean value in each period (mean over replications)\nplt.subplot(211, axisbg = 'w')\nplt.plot(time,meanVector[0:xMax], color = '#340B8C', \n         marker = 'x', ms = 4, mec = '#87051B', markevery = (asp, \n                                                             2*asp))\nplt.xticks(numpy.arange(0, T+1, jump), rotation = -45)\nplt.axhline(y = Results[0], color = '#299967', ls = '--')\nplt.ylabel('Mean Value')\nplt.xlabel('Time')\nplt.grid(True)\n\n\n## Second subplot: moving average for determining warm-up period\n## (Welch method)\nplt.subplot(212)    \nplt.plot(time[0:len(yBarWvector)],yBarWvector, color = '#340B8C')\nplt.xticks(numpy.arange(0, T+1, jump), rotation = -45)\nplt.ylabel('yBarW')\nplt.xlabel('Time')\nplt.xlim((0, T))\nplt.grid(True)\n\nIn the second subplot, what should be the arguments for plt.ylim() function? I tried defining \nymin, ymax = plt.ylim()\n\nin the first subplot and then set \nplt.ylim((ymin,ymax))\n\nin the second subplot. But that did not work, because the returned value ymax is the maximum value taken by the y variable (mean value) in the first subplot and not the upper limit of the y-axis. \nThanks in advance. \n", "output": "I searched some more on the matplotlib website and figured a way to do it. If anyone has a better way, please let me know. \nIn the first subplot replace plt.subplot(211, axisbg = 'w') by ax1 = plt.subplot(211, axisbg = 'w')\n. Then, in the second subplot, add the arguments sharex = ax1 and sharey = ax1 to the subplot command. That is, the second subplot command will now look:\nplt.subplot(212, sharex = ax1, sharey = ax1)\n\nThis solves the problem. But if there are other better alternatives, please let me know.\n"}
{"input": "Proving the primality of strong probable primes Using the probabilistic version of the Miller-Rabin test, I have generated a list of medium-large (200-300 digit) probable primes. But probable ain't good enough! I need to know these numbers are prime. Is there a library -- preferably wrapped or wrappable in Python -- that implements one of the more efficient primality proving algorithms?\nAlternatively, does anyone know where I can find a clear, detailed, and complete description of ECPP (or a similarly fast algorithm) that does not assume a great deal of prior knowledge?\nUpdate: I've found a Java implementation of another test, APRT-CLE, that conclusively proves primality. It verified a 291-digit prime candidate in under 10 minutes on an atom processor. Still hoping for something faster, but this seems like a promising start.\n", "output": "As an algorithm that gives a reliable polynomial primality test, consider AKS. There is an older SO article referencing implementations and presentations of the algorithm.\n"}
{"input": "Are there technical reasons a Ruby DSL like RSpec couldn't be rewritten in Python? The section below goes into more detail, but basically someone stated that the Ruby-written DSL RSpec couldn't be rewritten in Python. Is that true? If so, why? \nI'm wanting to better understand the technical differences between Ruby and Python.\nUpdate: Why am I asking this question?\nThe Running away from RSpec discussion has some statements about it being \"impossible\" to recreate RSpec in Python. I was trying to make the question a little broader in hopes of learning more of the technical differences between Ruby and Python. In hindsight, maybe I should have tightened the question's scope to just asking if it truly is impossible to recreate RSpec in Python, and if so why.\nBelow are just a few quotes from the Running away from RSpec discussion.\nInitial Question\n\nFor the past few weeks I have been thinking a lot about RSpec and why there is no clear, definite answer when someone asks:\n\n\"I'm looking for a Python equivalent of RSpec. Where can I find such a\n    thing?\"\n\nProbably the most common (and understandable) answer is that Python syntax\n  wouldn't allow such a thing whereas in Ruby it is possible.\n\nFirst Response to Initial Question\n\nNot syntax exactly. Rspec monkeypatches every object inside of its\n  scope, inserting the methods \"should\" and \"should_not\". You can do\n  something in python, but you can't monkeypatch the built-in types.\n\nAnother Response\n\nAs you suggest, it's impossible. Mote and PySpec are just fancy ways\n  to name your tests: weak implementations of one tiny corner of RSpec.\n  Mote uses horrible settrace magic; PySpec adds a bunch of\n  domain-irrelevant noise. Neither even supports arbitrary context\n  strings. RSpec is more terse, more expressive, removes the noise, and\n  is an entirely reasonable thing to build in Ruby.\nThat last point is important: it's not just that RSpec is possible in\n  Ruby; it's actually idiomatic. \n\n", "output": "If I had to point out one great difficulty for creating a Python RSpec, it would be the lack of a good syntax in Python for creating anonymous functions (as in JavaScript) or blocks (as in Ruby). The only option for a Python programmer is to use lambdas, which is not an option at all because lambdas just accept one expression. The do ... end blocks used in RSpec would have to be written as a function before calling describe and it, as in the example below:\ndef should_do_stuff():\n    # ...\nit(\"should do stuff\", should_do_stuff)\n\nNot so sexy, right?\nThere are some difficulties in creating the should methods, but I bet it would be a smaller problem. Actually, one does not even need to use such an unusual syntax\u00e2\u0080\u0094you could get similar results (maybe even better, depending on your taste) using the Jasmine syntax, which can be trivially implemented.\nThat said, I feel that Python syntax is more focused on efficiently representing the usual program components such as classes, functions, variables, etc. It is not well suited to be extended. I, for one, think that a good Python program is one where I can see objects, and functions, and variables, and I understand what each one of these elements do. Ruby programmers, OTOH, seem to seek for a more prose-like style, where a new language is defined for a new problem. It is a good way of doing things, too, but not a Pythonic way. Python is good to represent algorithms, not prose.\nSometimes it is a draconian limit. How could one use BDD for example? Well, the usual way of pushing these limits in Python is to effectively write your own DSL, but it should REALLY be another language. That is what Pyccuracy is, for example: another language for BDD. A more mainstream example is doctest. (Actually, if I would write some BDD Python library, I would write it based on doctest.) Another example of Python DSL is Twill. And yet another example is reStructuredText, used in Sphinx.\nSummarizing: IMHO the hardest barrier to DSLs in Python is the lack of a flexible syntax for creating anonymous functions. And it is not a fault*: Python is not fond of having its syntax heavily explored anyway\u00e2\u0080\u0094it is considered to make code less clear in the Python universe. If you want a new syntax in Python you are well advised to write your own language, or at least it is the way I feel.\n* Or maybe it is - I have to confess that I miss anonymous functions. However, I recognize that they would be hard to implement elegantly given the Python semantic indentation.\n"}
{"input": "Statically Typed Metaprogramming? I've been thinking about what I would miss in porting some Python code to a statically typed language such as F# or Scala; the libraries can be substituted, the conciseness is comparable, but I have lots of python code which is as follows:\n@specialclass\nclass Thing(object):\n    @specialFunc\n    def method1(arg1, arg2):\n        ...\n    @specialFunc\n    def method2(arg3, arg4, arg5):\n        ...\n\nWhere the decorators do a huge amount: replacing the methods with callable objects with state, augmenting the class with additional data and properties, etc.. Although Python allows dynamic monkey-patch metaprogramming anywhere, anytime, by anyone, I find that essentially all my metaprogramming is done in a separate \"phase\" of the program. i.e.:\nload/compile .py files\ntransform using decorators\n// maybe transform a few more times using decorators\nexecute code // no more transformations!\n\nThese phases are basically completely distinct; I do not run any application level code in the decorators, nor do I perform any ninja replace-class-with-other-class or replace-function-with-other-function in the main application code. Although the \"dynamic\"ness of the language says I can do so anywhere I want, I never go around replacing functions or redefining classes in the main application code because it gets crazy very quickly. \nI am, essentially, performing a single re-compile on the code before i start running it.\nThe only similar metapogramming i know of in statically typed languages is reflection: i.e. getting functions/classes from strings, invoking methods using argument arrays, etc. However, this basically converts the statically typed language into a dynamically typed language, losing all type safety (correct me if i'm wrong?). Ideally, I think, I would have something like the following:\nload/parse application files \nload/compile transformer\ntransform application files using transformer\ncompile\nexecute code\n\nEssentially, you would be augmenting the compilation process with arbitrary code, compiled using the normal compiler, that will perform transformations on the main application code. The point is that it essentially emulates the \"load, transform(s), execute\" workflow while strictly maintaining type safety.\nIf the application code are borked the compiler will complain, if the transformer code is borked the compiler will complain, if the transformer code compiles but doesn't do the right thing, either it will crash or the compilation step after will complain that the final types don't add up. In any case, you will never get the runtime type-errors possible by using reflection to do dynamic dispatch: it would all be statically checked at every step.\nSo my question is, is this possible? Has it already been done in some language or framework which I do not know about? Is it theoretically impossible? I'm not very familiar with compiler or formal language theory, I know it would make the compilation step turing complete and with no guarantee of termination, but it seems to me that this is what I would need to match the sort of convenient code-transformation i get in a dynamic language while maintaining static type checking.\nEDIT: One example use case would be a completely generic caching decorator. In python it would be:\ncacheDict = {}\ndef cache(func):\n    @functools.wraps(func)\n    def wrapped(*args, **kwargs):\n        cachekey = hash((args, kwargs))\n        if cachekey not in cacheDict.keys():\n            cacheDict[cachekey] = func(*args, **kwargs)\n        return cacheDict[cachekey]\n    return wrapped\n\n\n@cache\ndef expensivepurefunction(arg1, arg2):\n    # do stuff\n    return result\n\nWhile higher order functions can do some of this or objects-with-functions-inside can do some of this, AFAIK they cannot be generalized to work with any function taking an arbitrary set of parameters and returning an arbitrary type while maintaining type safety. I could do stuff like:\npublic Thingy wrap(Object O){ //this probably won't compile, but you get the idea\n    return (params Object[] args) => {\n        //check cache\n        return InvokeWithReflection(O, args)\n    }\n}\n\nBut all the casting completely kills type safety.\nEDIT: This is a simple example, where the function signature does not change. Ideally what I am looking for could modify the function signature, changing the input parameters or output type (a.l.a. function composition) while still maintaining type checking.\n", "output": "Very interesting question.\nSome points regarding metaprogramming in Scala:\n\nIn scala 2.10 there will be developments in scala reflection\nThere is work in source to source transformation (macros) which is something you are looking for: scalamacros.org\nJava has introspection (through the reflection api) but does not allow self modification. However you can use tools to support this (such as javassist). In theory you could use these tools in Scala to achieve more than introspection.\nFrom what I could understand of your development process, you separate your domain code from your decorators (or a cross cutting concern if you will) which allow to achieve modularity and code simplicity. This can be a good use for aspect oriented programming, which allows to just that. For Java theres is a library (aspectJ), however I'm dubious it will run with Scala.\n\n"}
{"input": "Use datetime.strftime() on years before 1900? (\"require year >= 1900\") I used :\nutctime = datetime.datetime(1601,1,1) + datetime.timedelta(microseconds = tup[5])\n last_visit_time = \"Last visit time:\"+ utctime.strftime('%Y-%m-%d %H:%M:%S')\nBut I have the time of 1601, so the error show:\nValueError: year=1601 is before 1900; the datetime strftime() methods require year >= 1900\nI used python2.7, how can I make it? Thanks a lot!\n", "output": "You can do the following:\n>>> utctime.isoformat()\n'1601-01-01T00:00:00.000050'\n\nNow if you want to have exactly the same format as above:\niso = utctime.isoformat()\ntokens = iso.strip().split(\"T\")\nlast_visit_time = \"Last visit time: %s %s\" % (tokens[0], tokens[1].strip().split(\".\")[0])\n\nNot that there seems to be a patch for strftime to fix this behavior here (not tested)\n"}
{"input": "How do I create a simple pdf file in python? I'm looking for a way to output a VERY simple pdf file from Python. Basically it will consist of two columns of words, one in Russian (so utf-8 characters) and the other in English.\nI've been googling for about an hour, and the packages I've found are either massive overkill (and still don't provide useful examples) such as ReportLab, or seem to assume that the only thing anyone would ever do with pdfs is concatenate several of them together (PyPdf, pdfrw).\nMaybe I'm just missing something obvious, but all the ones I've seen seem to launch into some massive discussion about taking 17 pdf files and converting them into a 60 foot wide poster with 23 panes (slight exaggeration maybe), and leave me wondering how to get the \"Hello World\" program working. Any help would be appreciated.\n", "output": "You may use wkhtmltopdf. It is a command line utility that uses Webkit to convert html to pdf.\nYou can generate your data as html and style it with css if you want, then, use wkhtmltopdf to generate the pdf file.\n"}
{"input": "Resizing numpy.memmap arrays I'm working with a bunch of large numpy arrays, and as these started to chew up too much memory lately, I wanted to replace them with numpy.memmap instances. The problem is, now and then I have to resize the arrays, and I'd preferably do that inplace. This worked quite well with ordinary arrays, but trying that on memmaps complains, that the data might be shared, and even disabling the refcheck does not help.\na = np.arange(10)\na.resize(20)\na\n>>> array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])\n\na = np.memmap('bla.bin', dtype=int)\na\n>>> memmap([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])\n\na.resize(20, refcheck=False)\n---------------------------------------------------------------------------\nValueError                                Traceback (most recent call last)\n<ipython-input-41-f1546111a7a1> in <module>()\n----> 1 a.resize(20, refcheck=False)\n\nValueError: cannot resize this array: it does not own its data\n\nResizing the underlying mmap buffer works perfectly fine. The problem is how to reflect these changes to the array object. I've seen this workaround, but unfortunately it doesn't resize the array in place. There is also some numpy documentation about resizing mmaps, but it's clearly not working, at least with version 1.8.0. Any other ideas, how to override the inbuilt resizing checks?\n", "output": "The issue is that the flag OWNDATA is False when you create your array. You can change that by requiring the flag to be True when you create the array:\n>>> a = np.require(np.memmap('bla.bin', dtype=int), requirements=['O'])\n>>> a.shape\n(10,)\n>>> a.flags\n  C_CONTIGUOUS : True\n  F_CONTIGUOUS : True\n  OWNDATA : True\n  WRITEABLE : True\n  ALIGNED : True\n  UPDATEIFCOPY : False\n>>> a.resize(20, refcheck=False)\n>>> a.shape\n(20,)\n\nThe only caveat is that it may create the array and make a copy to be sure the requirements are met. \nEdit to address saving:\nIf you want to save the re-sized array to disk, you can save the memmap as a .npy formatted file and open as a numpy.memmap when you need to re-open it and use as a memmap:\n>>> a[9] = 1\n>>> np.save('bla.npy',a)\n>>> b = np.lib.format.open_memmap('bla.npy', dtype=int, mode='r+')\n>>> b\nmemmap([0, 9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])\n\nEdit to offer another method:\nYou may get close to what you're looking for by re-sizing the base mmap (a.base or a._mmap, stored in uint8 format) and \"reloading\" the memmap:\n>>> a = np.memmap('bla.bin', dtype=int)\n>>> a\nmemmap([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])\n>>> a[3] = 7\n>>> a\nmemmap([0, 0, 0, 7, 0, 0, 0, 0, 0, 0])\n>>> a.flush()\n>>> a = np.memmap('bla.bin', dtype=int)\n>>> a\nmemmap([0, 0, 0, 7, 0, 0, 0, 0, 0, 0])\n>>> a.base.resize(20*8)\n>>> a.flush()\n>>> a = np.memmap('bla.bin', dtype=int)\n>>> a\nmemmap([0, 0, 0, 7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])\n\n"}
{"input": "What's the deal with Python 3.4, Unicode, different languages and Windows? Happy examples:\n#!/usr/bin/env python\n# -*- coding: utf-8 -*-\n\nczech = u'Leo\u00c5\u00a1 Jan\u00c3\u00a1\u00c4\u008dek'.encode(\"utf-8\")\nprint(czech)\n\npl = u'Zdzis\u00c5\u0082aw Beksi\u00c5\u0084ski'.encode(\"utf-8\")\nprint(pl)\n\njp = u'\u00e3\u0083\u00aa\u00e3\u0083\u00b3\u00e3\u0082\u00b0 \u00e5\u00b1\u00b1\u00e6\u009d\u0091 \u00e8\u00b2\u009e\u00e5\u00ad\u0090'.encode(\"utf-8\")\nprint(jp)\n\nchinese = u'\u00e4\u00ba\u0094\u00e8\u00a1\u008c'.encode(\"utf-8\")\nprint(chinese)\n\nMIR = u'\u00d0\u009c\u00d0\u00b0\u00d1\u0088\u00d0\u00b8\u00d0\u00bd\u00d0\u00b0 \u00d0\u00b4\u00d0\u00bb\u00d1\u008f \u00d0\u0098\u00d0\u00bd\u00d0\u00b6\u00d0\u00b5\u00d0\u00bd\u00d0\u00b5\u00d1\u0080\u00d0\u00bd\u00d1\u008b\u00d1\u0085 \u00d0\u00a0\u00d0\u00b0\u00d1\u0081\u00d1\u0087\u00d1\u0091\u00d1\u0082\u00d0\u00be\u00d0\u00b2'.encode(\"utf-8\")\nprint(MIR)\n\npt = u'Minha L\u00c3\u00adngua Portuguesa: \u00c3\u00a7\u00c3\u00a1\u00c3\u00a0'.encode(\"utf-8\")\nprint(pt)\n\nUnhappy output:\n\nAnd if I print them like this:\njp = u'\u00e3\u0083\u00aa\u00e3\u0083\u00b3\u00e3\u0082\u00b0 \u00e5\u00b1\u00b1\u00e6\u009d\u0091 \u00e8\u00b2\u009e\u00e5\u00ad\u0090'\nprint(jp)\n\nI get:\n\nI've also tried the following from this question (And other alternatives that involve sys.stdout.encoding):\n#!/usr/bin/env python\n# -*- coding: utf-8 -*-\n\nfrom __future__ import print_function\nimport sys\n\ndef safeprint(s):\n    try:\n        print(s)\n    except UnicodeEncodeError:\n        if sys.version_info >= (3,):\n            print(s.encode('utf8').decode(sys.stdout.encoding))\n        else:\n            print(s.encode('utf8'))\n\njp = u'\u00e3\u0083\u00aa\u00e3\u0083\u00b3\u00e3\u0082\u00b0 \u00e5\u00b1\u00b1\u00e6\u009d\u0091 \u00e8\u00b2\u009e\u00e5\u00ad\u0090'\nsafeprint(jp)\n\nAnd things get even more cryptic:\n\nAnd the docs were not very helpful.\nSo, what's the deal with Python 3.4, Unicode, different languages and Windows? Almost all possible examples I could find, deal with Python 2.x.\nIs there a general and cross-platform way of printing ANY Unicode character from any language in a decent and non-nasty way in Python 3.4?\nEDIT:\nI've tried typing at the terminal:\nchcp 65001\n\nTo change the code page, as proposed here and in the comments, and it did not work (Including the attempt with sys.stdout.encoding)\n", "output": "Update: Since Python 3.6, the code example that prints Unicode strings directly should just work now (even without py -mrun).\n\nPython can print text in multiple languages in Windows console whatever chcp says:\nT:\\> py -mpip install win-unicode-console\nT:\\> py -mrun your_script.py\n\nwhere your_script.py prints Unicode directly e.g.:\n#!/usr/bin/env python3\nprint('\u00c5\u00a1 \u00c3\u00a1\u00c4\u008d')      # cz\nprint('\u00c5\u0082 \u00c5\u0084')       # pl\nprint('\u00e3\u0083\u00aa\u00e3\u0083\u00b3\u00e3\u0082\u00b0')     # jp\nprint('\u00e4\u00ba\u0094\u00e8\u00a1\u008c')      # cn\nprint('\u00d1\u0088 \u00d1\u008f \u00d0\u00b6\u00d1\u0085 \u00d1\u0091') # ru\nprint('\u00c3\u00ad \u00c3\u00a7\u00c3\u00a1\u00c3\u00a0')    # pt\n\nAll you need is to configure the font in your Windows console that can display the desired characters.\nYou could also run your Python script via IDLE without installing non-stdlib modules:\nT:\\> py -midlelib -r your_script.py\n\nTo write to a file/pipe, use PYTHONIOENCODING=utf-8 as @Mark Tolonen suggested:\nT:\\> set PYTHONIOENCODING=utf-8\nT:\\> py your_script.py >output-utf8.txt \n\nOnly the last solution supports non-BMP characters such as \u00ed\u00a0\u00bd\u00ed\u00b8\u0092 (U+1F612 UNAMUSED FACE) -- py -mrun can write them but Windows console displays them as boxes even if the font supports corresponding Unicode characters (though you can copy-paste the boxes into another program, to get the characters).\n"}
{"input": "Is it possible to implement Python code-completion in TextMate? PySmell seems like a good starting point.\nI think it should be possible, PySmell's idehelper.py does a majority of the complex stuff, it should just be a case of giving it the current line, offering up the completions (the bit I am not sure about) and then replacing the line with the selected one.\n>>> import idehelper\n>>> # The path is where my PYSMELLTAGS file is located:\n>>> PYSMELLDICT = idehelper.findPYSMELLDICT(\"/Users/dbr/Desktop/pysmell/\")\n>>> options = idehelper.detectCompletionType(\"\", \"\" 1, 2, \"\", PYSMELLDICT)\n>>> completions = idehelper.findCompletions(\"proc\", PYSMELLDICT, options)\n>>> print completions\n[{'dup': '1', 'menu': 'pysmell.pysmell', 'kind': 'f', 'word': 'process', 'abbr': 'process(argList, excluded, output, verbose=False)'}]\n\nIt'll never be perfect, but it would be extremely useful (even if just for completing the stdlib modules, which should never change, so you wont have to constantly regenerate the PYSMELLTAGS file whenever you add a function)\n\nProgressing! I have the utter-basics of completion in place - barely works, but it's close..\nI ran python pysmells.py /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/*.py -O /Library/Python/2.5/site-packages/pysmell/PYSMELLTAGS\nPlace the following in a TextMate bundle script, set \"input: entire document\", \"output: insert as text\", \"activation: key equivalent: alt+esc\", \"scope selector: source.python\"\n#!/usr/bin/env python\n\nimport os\nimport sys\nfrom pysmell import idehelper\n\nCUR_WORD = os.environ.get(\"TM_CURRENT_WORD\")\n\ncur_file = os.environ.get(\"TM_FILEPATH\")\norig_source = sys.stdin.read()\nline_no = int(os.environ.get(\"TM_LINE_NUMBER\"))\ncur_col = int(os.environ.get(\"TM_LINE_INDEX\"))\n\n# PYSMELLS is currently in site-packages/pysmell/\nPYSMELLDICT = idehelper.findPYSMELLDICT(\"/Library/Python/2.5/site-packages/pysmell/blah\")\noptions = idehelper.detectCompletionType(cur_file, orig_source, line_no, cur_col, \"\", PYSMELLDICT)\ncompletions = idehelper.findCompletions(CUR_WORD, PYSMELLDICT, options)\n\nif len(completions) > 0:\n    new_word = completions[0]['word']\n    new_word = new_word.replace(CUR_WORD, \"\", 1) # remove what user has already typed\n    print new_word\n\nThen I made a new python document, typed \"import urll\" and hit alt+escape, and it completed it to \"import urllib\"!\nAs I said, it's entirely a work-in-progress, so don't use it yet..\n\nLast update:\norestis has integrated this into the PySmell project's code! Any further fiddling will happen on github\n", "output": "EDIT: I've actually took your code above and integrated into a command. It will properly show a completion list for you to choose.\nYou can grab it here: http://github.com/orestis/pysmell/tree/master (hit download and do python setup.py install). It's rough but it works. - please report any errors on http://code.google.com/p/pysmell/\n--\nHi, I'm the developer of PySmell. I also use a Mac, so if you can send me an email (contact info is in the source code) with your progress so far, I can try to integrate it :)\nOh BTW it's called PySmell - no trailing 's' :)\n"}
{"input": "Is it possible to programmatically construct a Python stack frame and start execution at an arbitrary point in the code? Is it possible to programmatically construct a stack (one or more stack frames) in CPython and start execution at an arbitrary code point?  Imagine the following scenario:\n\nYou have a workflow engine where workflows can be scripted in Python with some constructs (e.g. branching, waiting/joining) that are calls to the workflow engine.\nA blocking call, such as a wait or join sets up a listener condition in an event-dispatching engine with a persistent backing store of some sort.\nYou have a workflow script, which calls the Wait condition in the engine, waiting for some condition that will be signalled later.  This sets up the listener in the event dispatching engine.\nThe workflow script's state, relevant stack frames including the program counter (or equivalent state) are persisted - as the wait condition could occur days or months later.\nIn the interim, the workflow engine might be stopped and re-started, meaning that it must be possible to programmatically store and reconstruct the context of the workflow script.\nThe event dispatching engine fires the event that the wait condition picks up.\nThe workflow engine reads the serialised state and stack and reconstructs a thread with the stack.  It then continues execution at the point where the wait service was called.\n\nThe Question\nCan this be done with an unmodified Python interpreter?  Even better, can anyone point me to some documentation that might cover this sort of thing or an example of code that programmatically constructs a stack frame and starts execution somewhere in the middle of a block of code?\nEdit: To clarify 'unmodified python interpreter', I don't mind using the C API (is there enough information in a PyThreadState to do this?) but I don't want to go poking around the internals of the Python interpreter and having to build a modified one.\nUpdate: From some initial investigation, one can get the execution context with PyThreadState_Get().  This returns the thread state in a PyThreadState (defined in pystate.h), which has a reference to the stack frame in frame.  A stack frame is held in a struct typedef'd to PyFrameObject, which is defined in frameobject.h.  PyFrameObject has a field f_lasti (props to bobince) which has a program counter expressed as an offset from the beginning of the code block.\nThis last is sort of good news, because it means that as long as you preserve the actual compiled code block, you should be able to reconstruct locals for as many stack frames as necessary and re-start the code.  I'd say this means that it is theoretically possible without having to make a modified python interpereter, although it means that the code is still probably going to be fiddly and tightly coupled to specific versions of the interpreter.\nThe three remaining problems are: \n\nTransaction state and 'saga' rollback, which can probably be accomplished by the sort of metaclass hacking one would use to build an O/R mapper.  I did build a prototype once, so I have a fair idea of how this might be accomplished.\nRobustly serialising transaction state and arbitrary locals.  This might be accomplished by reading __locals__ (which is available from the stack frame) and programatically constructing a call to pickle.  However, I don't know what, if any, gotchas there might be here.\nVersioning and upgrade of workflows.  This is somewhat trickier, as the system is not providing any symbolic anchors for workflow nodes.  All we have is the anchor\nIn order to do this, one would have to identify the offsets of all of the entry points and map them to the new version.  Probably feasible to do manually, but I suspect it would be hard to automate.  This is probably the biggest obstacle if you want to support this capability.  \n\nUpdate 2: PyCodeObject (code.h) has a list of addr (f_lasti)-> line number mappings in PyCodeObject.co_lnotab (correct me if wrong here).  This might be used to facilitate a migration process to update workflows to a new version, as frozen instruction pointers could be mapped to the appropriate place in the new script, done in terms of the line numbers.  Still quite messy but a little more promising.\nUpdate 3: I think the answer to this might be Stackless Python.  You can suspend tasks and serialise them.  I haven't worked out whether this will also work with the stack as well.\n", "output": "The expat python bindings included in the normal Python distribution is constructing stack frames programtically. Be warned though, it relies on undocumented and private APIs.\nhttp://svn.python.org/view/python/trunk/Modules/pyexpat.c?rev=64048&view=auto\n"}
{"input": "python single configuration file I am developing a project that requires a single configuration file whose data is used by multiple modules.\nMy question is: what is the common approach to that? should i read the configuration file from each\nof my modules (files) or is there any other way to do it?\nI was thinking to have a module named config.py that reads the configuration files and whenever I need a config I do import config and then do something like config.data['teamsdir'] get the 'teamsdir' property (for example).\nresponse: opted for the conf.py approach then since it it is modular, flexible and simple\nI can just put the configuration data directly in the file, latter if i want to read from a json file a xml file or multiple sources i just change the conf.py and make sure the data is accessed the same way.\naccepted answer: chose \"Alex Martelli\" response because it was the most complete. voted up other answers because they where good and useful too.\n", "output": "I like the approach of a single config.py module whose body (when first imported) parses one or more configuration-data files and sets its own \"global variables\" appropriately -- though I'd favor config.teamdata over the round-about config.data['teamdata'] approach.\nThis assumes configuration settings are read-only once loaded (except maybe in unit-testing scenarios, where the test-code will be doing its own artificial setting of config variables to properly exercise the code-under-test) -- it basically exploits the nature of a module as the simplest Pythonic form of \"singleton\" (when you don't need subclassing or other features supported only by classes and not by modules, of course).\n\"One or more\" configuration files (e.g. first one somewhere in /etc for general default settings, then one under /usr/local for site-specific overrides thereof, then again possibly one in the user's home directory for user specific settings) is a common and useful pattern.\n"}
{"input": "Getting CPU temperature using Python? How do I retrieve the temperature of my CPU using Python?  (Assuming I'm on Linux)\n", "output": "There is a newer API (see also LWN article and Linux kernel doc) showing temperatures under e.g.\n/sys/class/thermal/thermal_zone0/temp\n\nReadings are in thousandths of degrees Celcius (although in older kernels, it may have just been degrees C).\n"}
{"input": "Which Python IDE can run my script line-by-line? I wouldn't call myself programmer, but I've started learning Python recently and really enjoy it.\nI mainly use it for small tasks so far - scripting, text processing, KML generation and ArcGIS.\nFrom my experience with R (working with excellent Notepad++ and NppToR combo) I usually try to work with my scripts line by line (or region by region) in order to understand what each step of my script is doing.. and to check results on the fly.\nMy question: is there and IDE (or editor?) for Windows that lets you evaluate single line of Python script?\nI have seen quite a lot of discussion regarding IDEs in Python context.. but havent stubled upon this specific question so far.\nThanks for help!\n", "output": "If you like R's layout. I highly recommend trying out Spyder. If you are using windows, try out Python(x,y). It is a package with a few different editors and a lot of common extra modules like scipy and numpy.\n"}
{"input": "cx_freeze python single file? I've been using cx_freeze for a while now and there is one thing I've really wanted to do: put ALL of the files into ONE executable that I can distribute. It's not really user friendly to send around a folder filled with 30 files, all in the same directory. How can I accomplish this?\nThanks.\n", "output": "It is not supported in cx_freeze. There was discussion about it on mailing list. As I recall, the author stated that others (PyInstaller, Py2Exe) use some dirty hacks to achieve that. Some anti-virus programs can consider that behavior as a virus also.\nI used PyInstaller for some time, but went back to cx_freeze because of Python 2.7 support. I pack everything using Inno Setup, so it is easy to distribute. Depends if you use Windows or not.\n"}
{"input": "Python: in-memory object database which supports indexing? I'm doing some data munging which would be quite a bit simpler if I could stick a bunch of dictionaries in an in-memory database, then run simply queries against it. \nFor example, something like:\npeople = db([\n    {\"name\": \"Joe\", \"age\": 16},\n    {\"name\": \"Jane\", \"favourite_color\": \"red\"},\n])\nover_16 = db.filter(age__gt=16)\nwith_favorite_colors = db.filter(favorite_color__exists=True)\n\nThere are three confounding factors, though: \n\nSome of the values will be Python objects, and serializing them is out of the question (too slow, breaks identity). Of course, I could work around this (eg, by storing all the items in a big list, then serializing their indexes in that list\u00e2\u0080\u00a6 But that could take a fair bit of fiddling).\nThere will be thousands of data, and I will be running lookup-heavy operations (like graph traversals) against them, so it must be possible to perform efficient (ie, indexed) queries.\nAs in the example, the data is unstructured, so systems which require me to predefine a schema would be tricky.\n\nSo, does such a thing exist? Or will I need to kludge something together?\n", "output": "What about using an in-memory SQLite database via the sqlite3 standard library module, using the special value :memory: for the connection? If you don't want to write your on SQL statements, you can always use an ORM, like SQLAlchemy, to access an in-memory SQLite database.\nEDIT: I noticed you stated that the values may be Python objects, and also that you require avoiding serialization. Requiring arbitrary Python objects be stored in a database also necessitates serialization.\nCan I propose a practical solution if you must keep those two requirements? Why not just use Python dictionaries as indices into your collection of Python dictionaries? It sounds like you will have idiosyncratic needs for building each of your indices; figure out what values you're going to query on, then write a function to generate and index for each. The possible values for one key in your list of dicts will be the keys for an index; the values of the index will be a list of dictionaries. Query the index by giving the value you're looking for as the key.\nimport collections\nimport itertools\n\ndef make_indices(dicts):\n    color_index = collections.defaultdict(list)\n    age_index = collections.defaultdict(list)\n    for d in dicts:\n        if 'favorite_color' in d:\n            color_index[d['favorite_color']].append(d)\n        if 'age' in d:\n            age_index[d['age']].append(d)\n    return color_index, age_index\n\n\ndef make_data_dicts():\n    ...\n\n\ndata_dicts = make_data_dicts()\ncolor_index, age_index = make_indices(data_dicts)\n# Query for those with a favorite color is simply values\nwith_color_dicts = list(\n        itertools.chain.from_iterable(color_index.values()))\n# Query for people over 16\nover_16 = list(\n        itertools.chain.from_iterable(\n            v for k, v in age_index.items() if age > 16)\n)\n\n"}
{"input": "Why are there no Makefiles for automation in Python projects? As a long time Python programmer, I wonder, if a central aspect of Python culture eluded me a long time: What do we do instead of Makefiles?\nMost ruby-projects I've seen (not just rails) use Rake, shortly after node.js became popular, there was cake. In many other (not only compliled languages) there are classic Make files.\nBut in Python, noone seems to need such infrastructure. I randomly picked Python projects on GitHub, and they had no automation, besides the installtion, provided by setup.py.\nWhat's the reason behind this?\nIs there nothing to automate? Do most programmers prefer to run stylechecks, tests, etc. manually?\nSome examples:\n\ndependecies sets up a virtualenv and installs the dependencies\ncheck calls the pep8 and pylint commandlinetools.\nthe test task depends on dependencies enables the virtualenv, starts selenium-server for the integration tests, and calls nosetest\nthe coffeescript task compiles all coffeescripts to minified javascript\nthe runserver task depends on dependencies and coffeescript\nthe deploy task depends on check and test and deploys the project.\nthe docs task calls sphinx with the appropiate arguments\n\nSome of them are just one or two-liners, but imho, they add up. Due to the Makefile, i don't have to remember them.\nTo clarify: I'm not looking for a Python equivalent for Rake. I'm glad with paver. I'm looking for the reasons.\n", "output": "Setuptools can automate a lot of things, and for things that aren't built-in, it's easily extensible.\n\nTo run unittests, you can use the setup.py test command after having added a test_suite argument to the setup() call. (documentation)\nDependencies  (even if not available on PyPI) can be handled by adding a install_requires/extras_require/dependency_links argument to the setup() call. (documentation)\nTo create a .deb package, you can use the stdeb module.\nFor everything else, you can add custom setup.py commands.\n\nBut I agree with S.Lott, most of the tasks you'd wish to automate (except dependencies handling maybe, it's the only one I find really useful) are tasks you don't run everyday, so there wouldn't be any real productivity improvement by automating them.\n"}
{"input": "Python: what are the nearest Linux and OSX equivalents of winsound.Beep? If one wishes to beep the speaker on Windows, Python 2 apparently provides a useful function: winsound.Beep(). The neat thing about this function is that it takes arguments specifying the exact frequency and duration of the beep. This is exactly what I want to do, except that I don't use Windows. So...\nWhat are the nearest equivalents of winsound.Beep() for Linux and OSX, bringing in as few dependencies as possible?\nPlease note that I want to be able to beep the speaker directly, not to play a sound file. Also, I need to be able to control the frequency and duration of the beep, so curses.beep() and print '\\a' won't do. Lastly, I am aware that PyGame provides extensive sound capabilities, but given that I don't require any of PyGame's other functionality, that would seem like using a sledgehammer to crack a nut (and anyway, I'm trying to do away with dependencies as far as possible).\n", "output": "I found a potential solution here:\nhttp://bytes.com/topic/python/answers/25217-beeping-under-linux\nIt involves writing directly to /dev/audio. Not sure how portable it is or if it even works at all - i'm not on a linux machine atm.\ndef beep(frequency, amplitude, duration):\n    sample = 8000\n    half_period = int(sample/frequency/2)\n    beep = chr(amplitude)*half_period+chr(0)*half_period\n    beep *= int(duration*frequency)\n    audio = file('/dev/audio', 'wb')\n    audio.write(beep)\n    audio.close()\n\n"}
{"input": "What could affect Python string comparison performance for strings over 64 characters? I'm trying to evaluate if comparing two string get slower as their length increases. My calculations suggest comparing strings should take an amortized constant time, but my Python experiments yield strange results:\nHere is a plot of string length (1 to 400) versus time in milliseconds. Automatic garbage collection is disabled, and gc.collect is run between every iteration.\n\nI'm comparing 1 million random strings each time, counting matches as follows.The process is repeated 50 times before taking the min of all measured times.\nfor index in range(COUNT):\n    if v1[index] == v2[index]:\n        matches += 1\n    else:\n        non_matches += 1\n\nWhat might account for the sudden increase around length 64?\nNote: The following snippet can be used to try to reproduce the problem assuming v1 and v2 are two lists of random strings of length n and COUNT is their length.\ntimeit.timeit(\"for i in range(COUNT): v1[i] == v2[i]\",\n  \"from __main__ import COUNT, v1, v2\", number=50)\n\nFurther note: I've made two extra tests: comparing string with is instead of == suppresses the problem completely, and the performance is about 210ms/1M comparisons.\nSince interning has been mentioned, I made sure to add a white space after each string, which should prevent interning; that doesn't change anything. Is it something else than interning then?\n", "output": "Python can 'intern' short strings; stores them in a special cache, and re-uses string objects from that cache.\nWhen then comparing strings, it'll first test if it is the same pointer (e.g. an interned string):\nif (a == b) {\n    switch (op) {\n    case Py_EQ:case Py_LE:case Py_GE:\n        result = Py_True;\n        goto out;\n// ...\n\nOnly if that pointer comparison fails does it use a size check and memcmp to compare the strings.\nInterning normally only takes place for identifiers (function names, arguments, attributes, etc.) however, not for string values created at runtime.\nAnother possible culprit is string constants; string literals used in code are stored as constants at compile time and reused throughout; again only one object is created and identity tests are faster on those.\nFor string objects that are not the same, Python tests for equal length, equal first characters then uses the memcmp() function on the internal C strings. If your strings are not interned or otherwise are reusing the same objects, all other speed characteristics come down to the memcmp() function.\n"}
{"input": "How can I train a Genetic Programming algorithm onto a variable sequence of descriptors? I am currently trying to design a Genetic Programming algorithm that analyses a sequence of characters and assigns a value to those characters. Below I have made up an example set. Every line represents a data point. The values that are trained are real-valued.\nExample:\nFor the word ABCDE the algorithm should return 1.0. \nExample dataset:\nABCDE  :  1\nABCDEF  :  10\nABCDEGH  :  3\nABCDELKA  :  50\nAASD  :  3\nThe dataset could be as large as it is needed, since this is all just made up. Lets assume the rule that the GP should figure out is not too complicated and that its explained by the data.\nWhat i would like the algorithm to do is to approximate the values from my dataset when given the input sequence. My problem now is that each sequence can consist of a different number of characters.  I would prefer not to need to write some fancy descriptors myself, if possible. \nHow can I train my GP (preferably using tinyGP or python) to build this model? \nSince there was so much discussion here - a diagram says a thousand words:\n\nWhat I want to do is just put a data point and put that into a function. Then I get a value, which is my result. Unfortunately i do not know this function, I just have a dataset that has some examples (maybe 1000 examples just an example). Now I use the Genetic Programming Algorithm to find an Algorithm that is able to convert my Datapoint into a Result. This is my model. The problem that I have in this case is that the data points are of differing lengths. For a set length I could just specify each of the characters in the string as a input parameter. But beats me what to do if I have a varying number of input parameters. \nDisclaimer: I have gotten to this problem multiple times during my studies, but we could never work out a solution that would work out well (like using a window, descriptors, etc). I would like to use a GP, because I like the technology and would like to try it out, but during Uni we also tried this with ANNs, etc, but to no avail. The problem of the variable input size remains.\n", "output": "Since you have not a fitness function, you will need to treat the genetic algorithm as it was a classifier. So you will need to come up with a way to evaluate a single chromosome. As others suggested you, this is a pure classification problem, not an optimization one, but, if you still want to go ahead with GA, here you have some steps to try an initial approach:\nYou will need:\n\nDescription of (how to encode) a valid chromosome\n\nTo work with genetic algorithms, all the solutions must have same length (there are more advanced approach with variable length enconding, but I wont enter there). So, having that, you will need to find an optimal encode method. Knowing that your input is a variable length string, you can encode your chromosome as a lookup table (dictionary in python) for your alphabet. However, a dictionary will give you some problems when you try to apply crossover or mutation operations, so is better to have the alphabet and chromosome encoding splitted. Refering to language models, you can check for n-grams, and your chromosome will have the same length as the length of your alphabet:\n.. Unigrams\nalphabet = \"ABCDE\"\nchromosome1 = [1, 2, 3, 4, 5]\nchromosome2 = [1, 1, 2, 1, 0]\n\n.. Bigrams\nalphabet = [\"AB\", \"AC\", \"AD\", \"AE\", \"BC\", \"BD\", \"BE\", \"CD\", \"CE\", \"DE\"]\nchromosome = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]\n\n.. Trigrams\nalphabet = [\"ABC\", \"ABD\", \"ABE\"...]\nchromosome = as above, a value for each combination\n\n2.\n Decode a chromosome to evaluate a single input\nYour chromosome will represent an integer values for each element in your alphabet. So if you want to know a value of one of your inputs (variable length string) having a chromosome, you will need to try some evaluation functions, simplest one is the sum of each letter value. \nalphabet = \"ABC\"\nchromosome = [1, 2, 1]\ninput = \"ABBBC\"\n\n# acc = accumulated value\nvalue = reduce(lambda acc, x: acc + chromosme[alphabet.index(x)], input, 0)\n# Will return ABBBC = 1+2+2+2+1 = 8\n\n3.\n Fitness function\nYour fitness function is just a simple error function. You can use simple error sum, square error... A simple evaluation function for a single gen:\ndef fitnessFunction(inputs, results, alphabet, chromosome):\n    error = 0\n\n    for i in range(len(inputs)):\n        value = reduce(lambda acc, x: acc + chromosome[alphabet.index(x)], inputs[i], 0) \n        diff = abs(results[i] - value)\n        error += diff # or diff**2 if you want squared error\n\n    return error\n\n# A simple call -> INPUTS, EXPECTED RESULTS, ALPHABET, CURRENT CHROMOSOME\nfitnessFunction([\"ABC\", \"ABB\", \"ABBC\"], [1,2,3], \"ABC\", [1, 1, 0])\n# returned error will be:\n# A+B+C = 1 + 1 + 0 -- expected value = 1 --> error += 1\n# A+B+B = 1 + 1 + 1 -- expected value = 2 --> error += 1\n# A+B+C = 1 + 1 + 1 + 0 -- expected value = 3 --> error += 0\n# This chromosome has error of 2\n\nNow, using any crossover and mutation operator you want (e.g.: one point crossover and bit flip mutation), find the chromosome that minimizes that error.\nThings you can try to improve the algorithm model:\n\nUsing bigrams or trigrams\nChange evaluate method (currently is a sum of lookup table values, it can be a product or something more complex)\nTry using real values in chromosomes, instead of just integers\n\n"}
{"input": "Python file open function modes I have noticed that, in addition to the documented mode characters, Python 2.7.5.1 in Windows XP and 8.1 also accepts modes U and D at least when reading files. Mode U is used in numpy's genfromtxt. Mode D has the effect that the file is deleted, as per the following code fragment:\n f = open('text.txt','rD')\n print(f.next())\n f.close()  # file text.txt is deleted when closed\n\nDoes anybody know more about these modes, especially whether they are a permanent feature of the language applicable also on Linux systems?\n", "output": "The D flag seems to be Windows specific. Windows seems to add several flags to the fopen function in its CRT, as described here.\nWhile Python does filter the mode string to make sure no errors arise from it, it does allow some of the special flags, as can be seen in the Python sources here. Specifically, it seems that the N flag is filtered out, while the T and D flags are allowed:\nwhile (*++mode) {\n    if (*mode == ' ' || *mode == 'N') /* ignore spaces and N */\n        continue;\n    s = \"+TD\"; /* each of this can appear only once */\n    ...\n\nI would suggest sticking to the documented options to keep the code cross-platform.\n"}
{"input": "Java equivalent of Python repr()? Is there a Java method that works like Python's repr? For example, assuming the function were named repr,\n\"foo\\n\\tbar\".repr()\n\nwould return\n\"foo\\n\\tbar\"\nnot\nfoo\n        bar\nas toString does.\n", "output": "In some projects, I use the following helper function to accomplish something akin to Python's repr for strings:\nprivate static final char CONTROL_LIMIT = ' ';\nprivate static final char PRINTABLE_LIMIT = '\\u007e';\nprivate static final char[] HEX_DIGITS = new char[] { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f' };\n\npublic static String toPrintableRepresentation(String source) {\n\n    if( source == null ) return null;\n    else {\n\n        final StringBuilder sb = new StringBuilder();\n        final int limit = source.length();\n        char[] hexbuf = null;\n\n        int pointer = 0;\n\n        sb.append('\"');\n\n        while( pointer < limit ) {\n\n            int ch = source.charAt(pointer++);\n\n            switch( ch ) {\n\n            case '\\0': sb.append(\"\\\\0\"); break;\n            case '\\t': sb.append(\"\\\\t\"); break;\n            case '\\n': sb.append(\"\\\\n\"); break;\n            case '\\r': sb.append(\"\\\\r\"); break;\n            case '\\\"': sb.append(\"\\\\\\\"\"); break;\n            case '\\\\': sb.append(\"\\\\\\\\\"); break;\n\n            default:\n                if( CONTROL_LIMIT <= ch && ch <= PRINTABLE_LIMIT ) sb.append((char)ch);\n                else {\n\n                    sb.append(\"\\\\u\");\n\n                    if( hexbuf == null ) \n                        hexbuf = new char[4];\n\n                    for( int offs = 4; offs > 0; ) {\n\n                        hexbuf[--offs] = HEX_DIGITS[ch & 0xf];\n                        ch >>>= 4; \n                    }\n\n                    sb.append(hexbuf, 0, 4);\n                }\n            }\n        }\n\n        return sb.append('\"').toString();\n    }\n}\n\nIts main advantage over many of the other solutions given here is, that it does not filter only a limited set of non-printable characters (like those replace-based solutions), but simply all non-printable ASCII characters. Some of it could have been written slightly nicer, but it actually does its job...\nNote, that like the Python function, this one will surround the string with quotes. If you do not want that, you will have to eliminate the append('\"') calls before and after the while loop.\n"}
{"input": "The \"right\" way to add python scripting to a non-python application I'm currently in the process of adding the ability for users to extend the functionality of my desktop application (C++) using plugins scripted in python.\nThe naive method is easy enough. Embed the python static library and follow any number of the dozens of tutorials scattered around the web describing how to initialize and call python files, and you're pretty much done.\nHowever...\nWhat I'm looking for is more like what Blender does. Blender is completely customizable through python scripts, and it requires an external python executable. (Ie. python isn't actually embedded in the blender executable at all.) So, naturally, you can include any modules you already have in your site-packages directory when you are writing blender scripts. Not that that's advised, since that would limit the portability of your script.\nSo, what I want to know is if there is already a way to have your cake and eat it too. I want a plugin system that uses:\n\nAn embedded python interpreter.\nThe downside of Blender's approach is that it forces you to have a specific, possibly outdated version of python installed globally on your system. Having an embedded interpreter allows me to control what version of python is being used.\nFirewall plugins.\nSome equivalent of a virtualenv for each plugin; allowing them to install all the modules they need or want, but keeping them seperated from possible conflicts in other plugins. Maybe zc.buildout is a better candidate here, but, again, I'm very open to suggestion. I'm a bit at a loss as to the best way to accomplish this.\nAs painless as possible...\nFor the user. I'm willing to go the extra mile, just so long as most of the above is as transparent to the plugin writer as possible.\n\n\nIf any of you folks out there have any experience with this sort of thing, your help would be much appreciated. :)\n\nEdit:\nBasically, the short version of what I want is the simplicity of virtualenv, but without the bundled python interpreter, and a way to activate a specific \"virtual environment\" programmatically, like zc.buildout does with sys.path manipulation (the sys.path[0:0] = [...] trick).\nBoth virtualenv and zc.buildout contain portions of what I want, but neither produce relocatable builds that I, or a plugin developer can simply zip up and send to another computer.\nSimply manipulating .pth files, or manipulating sys.path directly in a script, executed from my application gets me half-way there. But it is not enough when compiled modules are necessary, such as the PIL.\n", "output": "One effective way to accomplish this is to use a message-passing/communicating processes architecture, allowing you to accomplish your goal with Python, but not limiting yourself to Python.\n------------------------------------\n| App  <--> Ext. API <--> Protocol | <--> (Socket) <--> API.py <--> Script\n------------------------------------\n\nThis diagram attempts to show the following: Your application communicates with external processes (for example Python) using message passing. This is efficient on a local machine, and can be portable because you define your own protocol. The only thing that you have to give your users is a Python library that implements your custom API, and communicates using a Send-Receive communication loop between your user's script and your application.\nDefine Your Application's External API\nYour application's external API describes all of the functions that an external process must be able to interact with. For example, if you wish for your Python script to be able to draw a red circle in your application, your external API may include Draw(Object, Color, Position).\nDefine A Communication Protocol\nThis is the protocol that external processes use to communicate with your application through it's external API. Popular choices for this might be XML-RPC, SunRPC, JSON, or your own custom protocol and data format. The choice here needs to be sufficient for your API. For example, if you are going to be transferring binary data then JSON might require base64 encoding, while SunRPC assumes binary communication.\nBuild Your Application's Messaging System\nThis is as simple as an infinite loop receiving messages in your communication protocol, servicing the request within your application, and replying over the same socket/channel. For example, if you chose JSON then you would receive a message containing instructions to execute Draw(Object, Color, Position). After executing the request, you would reply to the request.\nBuild A Messaging Library For Python (or whatever else)\nThis is even simpler. Again, this is a loop sending and receiving messages on behalf the library user (i.e. your users writing Python scripts). The only thing this library must do is provide a programmatic interface to your Application's External API and translate requests into your communication protocol, all hidden from your users.\nUsing Unix Sockets, for example, will be extremely fast.\nPlugin/Application Rendezvous\nA common practice for discovering application plugins is to specify a \"well known\" directory where plugins should be placed. This might be, for example:\n~/.myapp/plugins\n\nThe next step is for your application to look into this directory for plugins that exist. Your application should have a some smarts to be able to distinguish between Python scripts that are, and are not, real scripts for your application.\nLet's assume that your communication protocol specifies that each script will communicate using JSON over StdInput/StdOuput. A simple, effective approach is to specify in your protocol that the first time a script runs it sends a MAGIC_ID to standard out. That is, your application reads the first, say, 8 bytes, and looks for a specific 64-bit value that identifies it as a script.\nAdditionally, you should include in your External API methods that allow your scripts to identify themselves. For example, a script should be able to inform the application through the External API things such as Name, Description, Capabilities, Expectations, essentially informing the application what it is, and what it will be doing.\n"}
{"input": "How should I do rapid GUI development for R and Octave methods (possibly with Python)? We are a medium-sized academic research lab whose main outputs are new statistical methods for analyzing large datasets.  We generally develop in R and MATLAB/Octave.\nWe would like to expand the reach of our work by building simple, wizard-style user interfaces to access our methods, either web-apps like RNAfold or stand-alone applications to analyze private data.\nIdeally, we would like the interfaces to do some data checking, to only use FOSS, to run in Mac and Windows environments, and to be able to generate simple charts and graphs that can be output as figures suitable for publication.  Also, we like Python because it\u00e2\u0080\u0099s a popular language in the lab and in our research community. \nHowever, we want to be able to develop and release quickly and cheaply.  We are lucky to be able to fund one developer in the lab and s/he has to support multiple projects.\nThere are a lot of groups with the same needs and constraints as us, so it would be useful to be able to develop a consistent long-term strategy for this type of challenge.\nEdit\nI asked for comments on four possible approaches, included below with summary of comments:\n\nCalling the R and Octave methods from Python.\nAnswers:\n\nR / Python integration\n\nRpy  (from ChrisC)\npyRserve (from radek) \nPypeR  (from radek, see answer for paper link)\n\nUsing a Python client to call in to an Octave server (from ChrisC)\n\nRe-implementing our methods in Python.  In general, this is not easy because our methods rely on other methods also implemented in R and Octave.\nAnswers:\n\nThere was no comment on this approach, it's probably a bad idea.\n\nUsing a different framework to build user-interfaces.\nAnswers:\n\nHow about building GUIs and web-apps directly in R? (from hadley, see answer for tool refs)\n\nOther ideas that we haven\u00e2\u0080\u0099t considered, e.g., integrating with workflow management systems like Taverna.\nAnswers:\n\nTry a visual programming interface like RedR (from chrisamiller)\n\n\nMy preferred answer\nI'm first going to look into the approach suggested by Spacedman: using  QtDesigner to build the UI for PyQt4 and calling R with RPy.  We get a lot of students without a programming background who want to volunteer in the lab, and QtDesigner looks like something that we could train them to use effectively.\nI would like develop GUIs directly in R as recommended by hadley but RQt, which I presume that I need to use QtDesigner, does not appear to be in active development (last update was summer of 2009).\nAlso, it looks like we may have to start implementing our methods exclusively in R, and translating over from MATLAB/Octave to R.  In my field, this is probably not a bad idea anyways.\nAppendix: \n\nPython libraries to simplify user interface construction:\n\nwxPython (ChrisC)  \nPyQt4 (from Spacedman)\n\nOther projects that integrate Python, R, and Octave:\n\nSAGE (Dirk Eddelbuettel)\nShogun (Dirk Eddelbuettel)\n\nGUIs to make GUIs:\n\nQtDesigner with PyQt4 (from Spacedman).  Note Qt4 is both commercial and available under a LGPL.\n\n\n", "output": "Why not continue to develop directly in R?  There are a number of packages that allow you to develop GUIs (gWidgets RGtk, tcl/tk, RQt, Rwxwidgets, rjava) or web applications.\n"}
{"input": "Importing Python classes from different files in a subdirectory Here's the structure I'm working with.\ndirectory/\n          script.py\n          subdir/\n                 __init__.py\n                 myclass01.py\n                 myclass02.py\n\nWhat I want to do is import in script.py the classes defined in myclass01.py and myclass02.py. If I do:\nfrom subdir.myclass01 import *\n\nIt works fine for the class defined in myclass01.py. But with this solution if there are many classes defined in different files in subdir and I want to import all of them, I'd have to type one line for each file. There must be a shortcut for this. I tried:\nfrom subdir.* import *\n\nBut it didn't work out.\nEDIT: here are the contents of the files:\nThis is __init__.py (using __all__ as Apalala suggested):\n__all__ = ['MyClass01','MyClass02']\n\nThis is myclass01.py:\nclass MyClass01:\n    def printsomething():\n        print 'hey'\n\nThis is myclass02.py:\nclass MyClass02:\n    def printsomething():\n        print 'sup'\n\nThis is script.py:\nfrom subdir import *\nMyClass01().printsomething()\nMyClass02().printsomething()\n\nThis is the traceback that I get when I try to run script.py:\nFile \"script.py\", line 1, in <module>\n    from subdir import *\nAttributeError: 'module' object has no attribute 'MyClass01'\n\n", "output": "Although the names used there are different from what's shown in your question's directory structure, you could use my answer to the question titled Python namespacing and classes. The __init__.py shown there would have also allowed the usepackage.py script to have been written this way (package maps to subdir in your question, and Class1 to myclass01, etc):\nfrom package import *\n\nprint Class1\nprint Class2\nprint Class3\n\nRevision (updated):\nOops, sorry, the code in my other answer doesn't quite do what you want \u00e2\u0080\u0094 it only automatically imports the names of any package submodules. To make it also import the named attributes from each submodule requires a few more lines of code. Here's a modified version of the package's __init__.py file (which also works in Python 3.4.1):\ndef _import_package_files():\n    \"\"\" Dynamically import all the public attributes of the python modules\n        in this file's directory (the package directory) and return a list\n        of their names.\n    \"\"\"\n    import os\n\n    exports = []\n    globals_ = globals()\n    package_path = os.path.dirname(__file__)\n    package_name = os.path.basename(package_path)\n\n    for filename in os.listdir(package_path):\n        modulename, ext = os.path.splitext(filename)\n        if modulename[0] != '_' and ext in ('.py', '.pyw'):\n            # create a package relative subpackage name\n            subpackage = '{}.{}'.format(package_name, modulename)\n            module = __import__(subpackage, globals_, locals_, [modulename])\n            modict = module.__dict__\n            names = (modict['__all__'] if '__all__' in modict else\n                     [name for name in modict if name[0] != '_'])  # public names\n            exports.extend(names)\n            globals_.update((name, modict[name]) for name in names)\n\n    return exports\n\nif __name__ != '__main__':\n    __all__ = ['__all__'] + _import_package_files()  # '__all__' in __all__\n\nAlternatively you can put the above into a separate .py module file of its own in the package directory, and use it from the package's __init__.py like this:\nif __name__ != '__main__':\n    from ._import_package_files import *  # defines __all__\n    __all__.remove('__all__')  # prevent export (optional)\n\nWhatever you name the file, it should be something that starts with an _ underscore character so it doesn't try to import itself recursively.\n"}
{"input": "Need to avoid subprocess deadlock without communicate I need a execute a command that produces a lot of output and takes a lot of time to execute (> 30 minutes). I was thinking of using subprocess.Popen to do it. I need to capture the output of the command, so I pass PIPE to stdout and stderr.\nA deadlock problem when using Popen.wait() is well documented on a lot of forums, so Popen.communicate() is the proposed way of avoiding the deadlock. The problem with that solution is that communicate() blocks until the command is completed. I need to print everything that arrives on stdout while the command is executed. If there is no output after 20 minutes, the script execution will be killed.\nHere are some constraints that I need to respect:\n\nMy Python version is 2.4.2 and I can't upgrade.\nIf the solution is still to use subprocess, I need to pass subprocess.PIPE to all std handles to avoid this bug: http://bugs.python.org/issue1124861\n\nIs there a way to do it?\n", "output": "\nto workaround python bug #1124861 on Python2.4 you could attach stdin to a NUL device\n\n\nimport os\nfrom subprocess import PIPE, STDOUT, Popen\n\nlines = []\np = Popen(cmd, bufsize=1, stdin=open(os.devnull), stdout=PIPE, stderr=STDOUT)\nfor line in iter(p.stdout.readline, ''):\n      print line,          # print to stdout immediately\n      lines.append(line)   # capture for later\np.stdout.close()\np.wait()\n\n"}
{"input": "Stop processing Flask route if request aborted I have a flask REST endpoint that does some cpu-intensive image processing and takes a few seconds to return. Often, this endpoint gets called, then aborted by the client. In these situations I would like to cancel processing. How can I do this in flask?\nIn node.js, I would do something like:\nreq.on('close', function(){\n  //some handler\n});\n\nI was expecting flask to have something similar, or a synchronous method (request.isClosed()) that I could check at certain points during my processing and return if it's closed, but I can't find one.\nI thought about sending something to test that the connection is still open, and catching the exception if it fails, but it seems Flask buffers all outputs so the exception isn't thrown until the processing completes and tries to return the result:\n\nAn established connection was aborted by the software in your host machine\n\nHow can I cancel my processing half way through if the client aborts their request?\n", "output": "There is a potentially... hacky solution to your problem. Flask has the ability to stream content back to the user via a generator. The hacky part would be streaming blank data as a check to see if the connection is still open and then when your content is finished the generator could produce the actual image. Your generator could check to see if processing is done and return None or \"\" or whatever if it's not finished. \nfrom flask import Response\n\n@app.route('/image')\ndef generate_large_image():\n    def generate():\n        while True:\n            if not processing_finished():\n                yield \"\"\n            else:\n                yield get_image()\n    return Response(generate(), mimetype='image/jpeg')\n\nI don't know what exception you'll get if the client closes the connection but I'm willing to bet its error: [Errno 32] Broken pipe\n"}
{"input": "Charts in django Web Applications I want to Embed a chart in a Web Application developed using django.\nI have come across Google charts API, ReportLab, PyChart, MatPlotLib and ChartDirector\nI want to do it in the server side rather than send the AJAX request to Google chart APIs, as I also want to embed the chart into the PDF.\nWhich is the best option to use, and what are the relative merits and demerits of one over the other.\n", "output": "Another choice is CairoPlot.\nWe picked matplotlib over the others for some serious graphing inside one of our django apps, primarily because it was the only one that gave us exactly the kind of control we needed.\nPerformance generating PNG's was fine for us but... it was a highly specialized app with less than 10 logins a day.\n"}
{"input": "Compile Matplotlib for Python on Snow Leopard I've killed half a day trying to compile matplotlib for python on Snow Leopard.  I've used the googles and found this helpful page (http://blog.hyperjeff.net/?p=160) but I still can't get it to compile.  I see comments from other users on that page, so I know I'm not alone.\nI already installed zlib, libpng and freetype independently.\nI edited the make.osx file to contain this at the top:\nPREFIX=/usr/local\n\nPYVERSION=2.6\nPYTHON=python${PYVERSION}\nZLIBVERSION=1.2.3\nPNGVERSION=1.2.33\nFREETYPEVERSION=2.3.5\nMACOSX_DEPLOYMENT_TARGET=10.6\n\n## You shouldn't need to configure past this point\n\nPKG_CONFIG_PATH=\"${PREFIX}/lib/pkgconfig\"\nCFLAGS=\"-Os -arch x86_64 -arch i386 -I${PREFIX}/include\"\nLDFLAGS=\"-arch x86_64 -arch i386 -L${PREFIX}/lib\"\nCFLAGS_DEPS=\"-arch i386 -arch x86_64 -I${PREFIX}/include -I${PREFIX}/include/freetype2 -isysroot /Developer/SDKs/MacOSX10.6.sdk\"\nLDFLAGS_DEPS=\"-arch i386 -arch x86_64 -L${PREFIX}/lib -syslibroot,/Developer/SDKs/MacOSX10.6.sdk\"\n\nI then run:\nsudo make -f make.osx mpl_build\n\nwhich gives me:\nexport PKG_CONFIG_PATH=\"/usr/local/lib/pkgconfig\" &&\\\n    export MACOSX_DEPLOYMENT_TARGET=10.6 &&\\\n    export CFLAGS=\"-Os -arch x86_64 -arch i386 -I/usr/local/include\" &&\\\n    export LDFLAGS=\"-arch x86_64 -arch i386 -L/usr/local/lib\" &&\\\n    python2.6 setup.py build\n\n... snip ...\n\ngcc-4.2 -DNDEBUG -g -fwrapv -Os -Wall -Wstrict-prototypes -Os -arch x86_64 -arch i386 -I/usr/local/include -pipe -DPY_ARRAYAUNIQUE_SYMBOL=MPL_ARRAY_API -I/Library/Python/2.6/site-packages/numpy/core/include -I. -I/Library/Python/2.6/site-packages/numpy/core/include/freetype2 -I./freetype2 -I/System/Library/Frameworks/Python.framework/Versions/2.6/include/python2.6 -c src/ft2font.cpp -o build/temp.macosx-10.6-universal-2.6/src/ft2font.o\ncc1plus: warning: command line option \"-Wstrict-prototypes\" is valid for C/ObjC but not for C++\nIn file included from src/ft2font.h:13,\n                 from src/ft2font.cpp:1:\n/usr/local/include/ft2build.h:56:38: error: freetype/config/ftheader.h: No such file or directory\n\n... snip ...\n\nsrc/ft2font.cpp:98: error: \u00e2\u0080\u0098FT_Int\u00e2\u0080\u0099 was not declared in this scope\n/Library/Python/2.6/site-packages/numpy/core/include/numpy/__multiarray_api.h:1174: warning: \u00e2\u0080\u0098int _import_array()\u00e2\u0080\u0099 defined but not used\nlipo: can't open input file: /var/tmp//ccDOGx37.out (No such file or directory)\nerror: command 'gcc-4.2' failed with exit status 1\nmake: *** [mpl_build] Error 1\n\nI'm just lost.\n", "output": "According to your error message you have missing freetype headers. Can you locate them using system search functionalities. I will not lecture on using a pre-built package since I love scratching my head and compiling from the start as well.\n"}
{"input": "How to setup and launch a Scrapy spider programmatically (urls and settings) I've written a working crawler using scrapy,\nnow I want to control it through a Django webapp, that is to say:  \n\nSet 1 or several start_urls\nSet 1 or several allowed_domains\nSet settings values\nStart the spider \nStop / pause / resume a spider\nretrieve some stats while running\nretrive some stats after spider is complete.\n\nAt first I thought scrapyd was made for this, but after reading the doc, it seems that it's more a daemon able to manage 'packaged spiders', aka 'scrapy eggs'; and that all the settings (start_urls , allowed_domains, settings ) must still be hardcoded in the 'scrapy egg' itself ; so it doesn't look like a solution to my question, unless I missed something. \nI also looked at this question : How to give URL to scrapy for crawling? ; \nBut the best answer to provide multiple urls is qualified by the author himeslf as an 'ugly hack', involving some python subprocess and complex shell handling, so I don't think the solution is to be found here. Also, it may work for start_urls, but it doesn't seem to allow allowed_domains or settings.\nThen I gave a look to scrapy webservices :\nIt seems to be the good solution for retrieving stats. However, it still requires a running spider, and no clue to change settings\nThere are a several questions on this subject, none of them seems satisfactory:\n\nusing-one-scrapy-spider-for-several-websites\nThis one seems outdated, as scrapy has evolved a lot since 0.7\ncreating-a-generic-scrapy-spider\nNo accepted answer, still talking around tweaking shell parameters.\n\nI know that scrapy is used in production environments ; and a tool like scrapyd shows that there are definitvely some ways to handle these requirements (I can't imagine that the scrapy eggs scrapyd is dealing with are generated by hand !)  \nThanks a lot for your help.\n", "output": "\nAt first I thought scrapyd was made for this, but after reading the doc, it seems that it's more a daemon able to manage 'packaged spiders', aka 'scrapy eggs'; and that all the settings (start_urls , allowed_domains, settings ) must still be hardcoded in the 'scrapy egg' itself ; so it doesn't look like a solution to my question, unless I missed something. \n\nI don't agree to the above statement, start_urls need not be hard-coded they can be dynamically passed to the class, you should be able to pass it as an argument like this\nhttp://localhost:6800/schedule.json -d project=myproject -d spider=somespider -d setting=DOWNLOAD_DELAY=2 -d arg1=val1\n\nOr you should be able to retrieve the URLs from a database or a file. I get it from a database like this\nclass WikipediaSpider(BaseSpider):\n    name = 'wikipedia'\n    allowed_domains = ['wikipedia.com']\n    start_urls = []\n\n    def __init__(self, name=None, url=None, **kwargs):\n        item = MovieItem()\n        item['spider'] = self.name\n        # You can pass a specific url to retrieve \n        if url:\n            if name is not None:\n                self.name = name\n            elif not getattr(self, 'name', None):\n                raise ValueError(\"%s must have a name\" % type(self).__name__)\n            self.__dict__.update(kwargs)\n            self.start_urls = [url]\n        else:\n            # If there is no specific URL get it from Database\n            wikiliks = # < -- CODE TO RETRIEVE THE LINKS FROM DB -->\n            if wikiliks == None:\n                print \"**************************************\"\n                print \"No Links to Query\"\n                print \"**************************************\"\n                return None\n\n            for link in wikiliks:\n                # SOME PROCESSING ON THE LINK GOES HERE\n                self.start_urls.append(urllib.unquote_plus(link[0]))\n\n    def parse(self, response):\n        hxs = HtmlXPathSelector(response)\n        # Remaining parse code goes here\n\n"}
{"input": "Why does map return a map object instead of a list in Python 3? I am interested in understanding the new language design of Python 3.x.\nI do enjoy, in Python 2.7, the function map:\nPython 2.7.12\nIn[2]: map(lambda x: x+1, [1,2,3])\nOut[2]: [2, 3, 4]\n\nHowever, in Python 3.x things have changed:\nPython 3.5.1\nIn[2]: map(lambda x: x+1, [1,2,3])\nOut[2]: <map at 0x4218390>\n\nI understand the how, but I could not find a reference to the why. Why did the language designers make this choice, which, in my opinion, introduces a great deal of pain. Was this to arm-wrestle developers in sticking to list comprehensions?\nIMO, list can be naturally thought as Functors; and I have been somehow been thought to think in this way:\nfmap :: (a -> b) -> f a -> f b\n\n", "output": "Because it returns an iterator, it omit storing the full list in the memory. So that you can easily iterate over it in the future not making pain to memory. Possibly you even don't need a full list, but the part of it, until your condition is matched.\nYou can find this docs useful, because iterators are awesome.\n\nAn object representing a stream of data. Repeated calls to the iterator\u00e2\u0080\u0099s __next__() method (or passing it to the built-in function next()) return successive items in the stream. When no more data are available a StopIteration exception is raised instead. At this point, the iterator object is exhausted and any further calls to its __next__() method just raise StopIteration again. Iterators are required to have an __iter__() method that returns the iterator object itself so every iterator is also iterable and may be used in most places where other iterables are accepted. One notable exception is code which attempts multiple iteration passes. A container object (such as a list) produces a fresh new iterator each time you pass it to the iter() function or use it in a for loop. Attempting this with an iterator will just return the same exhausted iterator object used in the previous iteration pass, making it appear like an empty container. \n\n"}
{"input": "How do you add breakpoints to a python program in IDLE? I solved this already, I'm just posting it here because I couldn't figure it out with google, or by reading the docs. I only found it by accident.\nTo add a breakpoint, right-click on the line that you want to add a breakpoint on and select \"Set Breakpoint\". The line will become highlighted. (Note that this only works when you are editing a file; it does not work at the interpreter.)\nOn Mac, you need to Control-click instead of right clicking.\n", "output": "There is an open Python issue about the lack of documentation for IDLE breakpoints.  Contributions welcome!\nUpdate: The on-line documentation and the IDLE help files (for 2.7.4, 3.2.4, and 3.3.0) have been updated, with Nick's help.\n"}
{"input": "Is there \"Edit and Continue\" in PyCharm? Reload code into running program like in Eclipse / PyDev? Hi all Python developers!\nIn Eclipse with PyDev it is possible to edit a Python file while debugging. On save, the PyDev debugger will reload the updated code into the running program and uses my new code. How can I do the same thing in JetBrains PyCharm (using Community Edition)? \nEclipse / PyDev writes an output like this when I do that:\npydev debugger: Start reloading module: \"MyWidget\" ... \npydev debugger: Updated function code: <function close at 0x055F4E70>\npydev debugger: reload finished\n\nI searched settings and web and could not find any hint. Very glad about any idea. Thx.\nEdit: I found out in Eclipse/PyDev one has to be in debug mode to be able to use this feature. I tested in PyCharm, but there was no reload done.\n", "output": "After all I found a useful and acceptable workaround for my question. It works in PyCharm Community Edition 3.1.2 and I assume it will do in commercial edition as well. I tested on a mid-scale project using Python 2.7.6, PySide (Qt) with one main window and 20+ widgets, tabs, whatever. Follow these steps...\n\nWork in Pycharm on a python project :-)\nExecute your code in Debug mode (did not tried Release so far)\nEdit some code in one your modules imported during the life of your program\nMake your program pause. To achieve this, you can click the \"Pause\" button of in PyCharms Debug view and then any place in your applications main window where it would need to do something (for example on a tab header). If you have a long a running task and no UI, you may place a breakpoint in a place your program often comes by.\nIn the Debug view, switch to the Console tab. There is a button on the left Show command line. Click this.\nIn the console, type in reload(MyModifiedModule) if this call fails, write import MyModifiedModule and try again.\nClick resume in PyCharm.\nTry the code you fixed.\n\n\nThere are some restrictions on this... It won't fix changes in your main method or main window, cause it won't be created again. In my tests I could not reload widgets from Qt. But it worked for classes like data containers or workers.\nMay the force be with you as you try this and do not hesitate to add your experiences.\nThank you.\n"}
{"input": "Right way to return proxy model instance from a base model instance in Django? Say I have models:\nclass Animal(models.Model):\n    type = models.CharField(max_length=255)\n\nclass Dog(Animal):\n    def make_sound(self):\n        print \"Woof!\"\n    class Meta:\n        proxy = True\n\nclass Cat(Animal):\n    def make_sound(self):\n        print \"Meow!\"\n    class Meta:\n        proxy = True\n\nLet's say I want to do:\n animals = Animal.objects.all()\n for animal in animals:\n     animal.make_sound()\n\nI want to get back a series of Woofs and Meows. Clearly, I could just define a make_sound in the original model that forks based on animal_type, but then every time I add a new animal type (imagine they're in different apps), I'd have to go in and edit that make_sound function. I'd rather just define proxy models and have them define the behavior themselves. From what I can tell, there's no way of returning mixed Cat or Dog instances, but I figured maybe I could define a \"get_proxy_model\" method on the main class that returns a cat or a dog model.\nSurely you could do this, and pass something like the primary key and then just do Cat.objects.get(pk = passed_in_primary_key). But that'd mean doing an extra query for data you already have which seems redundant. Is there any way to turn an animal into a cat or a dog instance in an efficient way? What's the right way to do what I want to achieve?\n", "output": "The Metaclass approach proposed by thedk is indeed a very powerful way to go, however, I had to combine it with an answer to the question here to have the query return a proxy model instance. The simplified version of the code adapted to the previous example would be:\nfrom django.db.models.base import ModelBase\n\nclass InheritanceMetaclass(ModelBase):\n    def __call__(cls, *args, **kwargs):\n        obj = super(InheritanceMetaclass, cls).__call__(*args, **kwargs)\n        return obj.get_object()\n\nclass Animal(models.Model):\n    __metaclass__ = InheritanceMetaclass\n    type = models.CharField(max_length=255)\n    object_class = models.CharField(max_length=20)\n\n    def save(self, *args, **kwargs):\n        if not self.object_class:\n            self.object_class = self._meta.module_name\n        super(Animal, self).save( *args, **kwargs)\n\n    def get_object(self):\n        if self.object_class in SUBCLASSES_OF_ANIMAL:\n            self.__class__ = SUBCLASSES_OF_ANIMAL[self.object_class]\n        return self\n\nclass Dog(Animal):\n    class Meta:\n        proxy = True\n    def make_sound(self):\n        print \"Woof!\"\n\n\nclass Cat(Animal):\n    class Meta:\n        proxy = True\n    def make_sound(self):\n        print \"Meow!\"\n\n\nSUBCLASSES_OF_ANIMAL = dict([(cls.__name__, cls) for cls in ANIMAL.__subclasses__()])\n\nThe advantage of this proxy approach is that no db migration is required upon creation of new subclasses. The drawback is that no specific fields can be added to the subclasses.\nI would be happy to have feedback on this approach. \n"}
{"input": "Use pdb.set_trace() in a script that reads stdin via a pipe I have a python script that reads stdin via a pipe, and I cannot seem to use it with pdb.set_trace().  \nmy_script.py:\n#!/usr/bin/env python\nimport sys\nimport pdb\n\ndef main():\n    for line in sys.stdin:\n        print \"Printing a line: \" +line\n\nif __name__=='__main__':\n    status = main()\n\nSuppose tempfile.csv is some file with two lines,\n$ cat tempfile.csv \nline1\nline2\n\nthen I can run my script with:\n$ cat tempfile.csv | ./my_script.py, and everything is fine:\n$ cat tempfile.csv | ./my_script.py \nPrinting a line:  line1\n\nPrinting a line:  line2\n\nOn the other hand, if I put pdb.set_trace() anywhere then I get an error.  For example, putting pdb.set_trace() below def main(), then I get \n$ cat tempfile.csv | ./my_script.py \n> /home/ilangmore/mobiuss/TM/branches/hadooprotype/my_script.py(7)main()\n-> for line in sys.stdin:\n(Pdb) *** NameError: name 'line1' is not defined\n(Pdb) *** NameError: name 'line2' is not defined\n(Pdb) \nTraceback (most recent call last):\n  File \"./my_script.py\", line 11, in <module>\n    status = main()\n  File \"./my_script.py\", line 7, in main\n    for line in sys.stdin:\n  File \"./my_script.py\", line 7, in main\n    for line in sys.stdin:\n  File \"/usr/lib/python2.7/bdb.py\", line 48, in trace_dispatch\n    return self.dispatch_line(frame)\n  File \"/usr/lib/python2.7/bdb.py\", line 67, in dispatch_line\n    if self.quitting: raise BdbQuit\nbdb.BdbQuit\n\nNote that my question is probably related to this question (i.e. pdb by default reads from stdin), but I need more help.\n", "output": "Here's an example of what worked for me:\nlines = sys.stdin.readlines()\nsys.stdin = open('/dev/tty')\npdb.set_trace()\n\n"}
{"input": "Read timeout using either urllib2 or any other http library I have code for reading an url like this:\nfrom urllib2 import Request, urlopen\nreq = Request(url)\nfor key, val in headers.items():\n    req.add_header(key, val)\nres = urlopen(req, timeout = timeout)\n# This line blocks\ncontent = res.read()\n\nThe timeout works for the urlopen() call. But then the code gets to the res.read() call where I want to read the response data and the timeout isn't applied there. So the read call may hang almost forever waiting for data from the server. The only solution I've found is to use a signal to interrupt the read() which is not suitable for me since I'm using threads.\nWhat other options are there? Is there a HTTP library for Python that handles read timeouts? I've looked at httplib2 and requests and they seem to suffer the same issue as above. I don't want to write my own nonblocking network code using the socket module because I think there should already be a library for this. \nUpdate: None of the solutions below are doing it for me. You can see for yourself that setting the socket or urlopen timeout has no effect when downloading a large file:\nfrom urllib2 import urlopen\nurl = 'http://iso.linuxquestions.org/download/388/7163/http/se.releases.ubuntu.com/ubuntu-12.04.3-desktop-i386.iso'\nc = urlopen(url)\nc.read()\n\nAt least on Windows with Python 2.7.3, the timeouts are being completely ignored.\n", "output": "I found in my tests (using the technique described here) that a timeout set in the urlopen() call also effects the read() call:\nimport urllib2 as u\nc = u.urlopen('http://localhost/', timeout=5.0)\ns = c.read(1<<20)\nTraceback (most recent call last):\n  File \"<stdin>\", line 1, in <module>\n  File \"/usr/lib/python2.7/socket.py\", line 380, in read\n    data = self._sock.recv(left)\n  File \"/usr/lib/python2.7/httplib.py\", line 561, in read\n    s = self.fp.read(amt)\n  File \"/usr/lib/python2.7/httplib.py\", line 1298, in read\n    return s + self._file.read(amt - len(s))\n  File \"/usr/lib/python2.7/socket.py\", line 380, in read\n    data = self._sock.recv(left)\nsocket.timeout: timed out\n\nMaybe it's a feature of newer versions?  I'm using Python 2.7 on a 12.04 Ubuntu straight out of the box.\n"}
{"input": "Compiling an IronPython WPF project to exe What is the best way to pack up an IronPython application for deployment? After scouring the web the best thing I've come up with (and what I'm currently doing) is using clr.CompileModules() to glue together my entire project's .py files into one .dll, and then having a single run.py do this to run the dll: \nimport clr\nclr.AddReference('compiledapp.dll')\n\nimport app\n\nThis is still suboptimal, though, because it means I have to\n\ndistribute 3 files (the .dll, the .xaml, and the run.py launcher) \ninstall IronPython on the host machine\n\nPlus, this feels so... hacky, after the wonderful integration IronPython already has with Visual Studio 2010. I'm completely mystified as to why there is no integrated build system for IPy apps, seeing as it all boils down to IL anyway.\nIdeally, I want to be able to have a single .exe with the .xaml merged inside somehow (I read that C# apps compile XAML to BAML and merge them in the executable), and without requiring IronPython to be installed to run. Is this at least halfway possible? (I suppose it's ok if the exe needs some extra .DLLs with it or something. The important part is that it's in .exe form.)\n\nSome edits to clarify: I have tried pyc.py, but it seems to not acknowledge the fact that my project is not just app.py. The size of the exe it produces suggests  that it is just 'compiling' app.py without including any of the other files into the exe. So, how do I tell it to compile every file in my project? \nTo help visualize this, here is a screenshot of my project's solution explorer window.\n\nEdit II: It seems that unfortunately the only way is to use pyc.py and pass every single file to it as a parameter. There are two questions I have for this approach:\n\nHow do I possibly process a command line that big? There's a maximum of 256 characters in a command.\nHow does pyc.py know to preserve the package/folder structure? As shown in my project screenshot above, how will my compiled program know to access modules that are in subfolders, such as accessing DT\\Device? Is the hierarchy somehow 'preserved' in the dll?\n\n\nEdit III: Since passing 70 filenames to pyc.py through the command line will be unwieldy, and in the interest of solving the problem of building IPy projects more elegantly, I've decided to augment pyc.py. \nI've added code that reads in a .pyproj file through the /pyproj: parameter, parses the XML, and grabs the list of py files used in the project from there. This has been working pretty well; however, the executable produced seems to be unable to access the python subpackages (subfolders) that are part of my project. My version of pyc.py with my .pyproj reading support patch can be found here: http://pastebin.com/FgXbZY29\nWhen this new pyc.py is run on my project, this is the output:\nc:\\Projects\\GenScheme\\GenScheme>\"c:\\Program Files (x86)\\IronPython 2.7\\ipy.exe\"\npyc.py /pyproj:GenScheme.pyproj /out:App /main:app.py /target:exe\nInput Files:\n        c:\\Projects\\GenScheme\\GenScheme\\__init__.py\n        c:\\Projects\\GenScheme\\GenScheme\\Agent.py\n        c:\\Projects\\GenScheme\\GenScheme\\AIDisplay.py\n        c:\\Projects\\GenScheme\\GenScheme\\app.py\n        c:\\Projects\\GenScheme\\GenScheme\\BaseDevice.py\n        c:\\Projects\\GenScheme\\GenScheme\\BaseManager.py\n        c:\\Projects\\GenScheme\\GenScheme\\BaseSubSystem.py\n        c:\\Projects\\GenScheme\\GenScheme\\ControlSchemes.py\n        c:\\Projects\\GenScheme\\GenScheme\\Cu64\\__init__.py\n        c:\\Projects\\GenScheme\\GenScheme\\Cu64\\agent.py\n        c:\\Projects\\GenScheme\\GenScheme\\Cu64\\aidisplays.py\n        c:\\Projects\\GenScheme\\GenScheme\\Cu64\\devmapper.py\n        c:\\Projects\\GenScheme\\GenScheme\\Cu64\\timedprocess.py\n        c:\\Projects\\GenScheme\\GenScheme\\Cu64\\ui.py\n        c:\\Projects\\GenScheme\\GenScheme\\decorators.py\n        c:\\Projects\\GenScheme\\GenScheme\\DeviceMapper.py\n        c:\\Projects\\GenScheme\\GenScheme\\DT\\__init__.py\n        c:\\Projects\\GenScheme\\GenScheme\\DT\\Device.py\n        c:\\Projects\\GenScheme\\GenScheme\\DT\\Manager.py\n        c:\\Projects\\GenScheme\\GenScheme\\DT\\SubSystem.py\n        c:\\Projects\\GenScheme\\GenScheme\\excepts.py\n        c:\\Projects\\GenScheme\\GenScheme\\FindName.py\n        c:\\Projects\\GenScheme\\GenScheme\\GenScheme.py\n        c:\\Projects\\GenScheme\\GenScheme\\PMX\\__init__.py\n        c:\\Projects\\GenScheme\\GenScheme\\PMX\\Device.py\n        c:\\Projects\\GenScheme\\GenScheme\\PMX\\Manager.py\n        c:\\Projects\\GenScheme\\GenScheme\\PMX\\SubSystem.py\n        c:\\Projects\\GenScheme\\GenScheme\\pyevent.py\n        c:\\Projects\\GenScheme\\GenScheme\\Scheme.py\n        c:\\Projects\\GenScheme\\GenScheme\\Simulated\\__init__.py\n        c:\\Projects\\GenScheme\\GenScheme\\Simulated\\Device.py\n        c:\\Projects\\GenScheme\\GenScheme\\Simulated\\SubSystem.py\n        c:\\Projects\\GenScheme\\GenScheme\\speech.py\n        c:\\Projects\\GenScheme\\GenScheme\\stdoutWriter.py\n        c:\\Projects\\GenScheme\\GenScheme\\Step.py\n        c:\\Projects\\GenScheme\\GenScheme\\TimedProcess.py\n        c:\\Projects\\GenScheme\\GenScheme\\UI.py\n        c:\\Projects\\GenScheme\\GenScheme\\VirtualSubSystem.py\n        c:\\Projects\\GenScheme\\GenScheme\\Waddle.py\nOutput:\n        App\nTarget:\n        ConsoleApplication\nPlatform:\n        ILOnly\nMachine:\n        I386\nCompiling...\nSaved to App\n\nSo it correctly read in the list of files in the .pyproj... Great! But running the exe gives me this:\nUnhandled Exception: IronPython.Runtime.Exceptions.ImportException: \nNo module named Cu64.ui\n\nSo even though Cu64\\ui.py is obviously included in compilation, the exe, when run, can't find it. This is what I was afraid of in point #2 in the previous edit. How do I preserve the package hierarchy of my project? Perhaps compiling each package seperately may be needed?\nI'll extend the bounty for this question. Ultimately my hope is that we can get a working pyc.py that reads in pyproj files and produces working exes in one step. Then maybe it could even be submitted to IronPython's codeplex to be included in the next release... ;]\n", "output": "It \"boils down to IL\", but it isn't compatible with the IL that C# code produces, so it can't be directly compiled to a standalone .exe file.\nYou'll need to use pyc.py to compile your code to a stub EXE with the DLL that CompileModules creates.\nThen distribute those files with IronPython.dll, IronPython.Modules.dll, Microsoft.Dynamic.dll, Microsoft.Scripting.Debugging.dll, Microsoft.Scripting.dll, and of course the XAML file.\nTo compile other files, just add them as arguments:\nipy.exe pyc.py /main:app.py /target:winexe another.py another2.py additional.py\n"}
{"input": "How to compute scipy sparse matrix determinant without turning it to dense? I am trying to figure out the fastest method to find the determinant of sparse symmetric and real matrices in python. using scipy sparse module but really surprised that there is no determinant function. I am aware I could use LU factorization to compute determinant but don't see a easy way to do it because the return of scipy.sparse.linalg.splu is an object and instantiating a dense L and U matrix is not worth it - I may as well do sp.linalg.det(A.todense()) where A is my scipy sparse matrix. \nI am also a bit surprised why others have not faced the problem of efficient determinant computation within scipy. How would one use splu to compute determinant? \nI looked into pySparse and scikits.sparse.chlmod. The latter is not practical right now for me - needs package installations and also not sure sure how fast the code is before I go into all the trouble. \nAny solutions? Thanks in advance. \n", "output": "The \"standard\" way to solve this problem is with a cholesky decomposition, but if you're not up to using any new compiled code, then you're out of luck. The best sparse cholesky implementation is Tim Davis's CHOLMOD, which is licensed under the LGPL and thus not available in scipy proper (scipy is BSD).\n"}
{"input": "dreaded \"not the same object error\" pickling a queryset.query object I have a queryset that I need to pickle lazily and I am having some serious troubles. cPickle.dumps(queryset.query) throws the following error:\nCan't pickle <class 'myproject.myapp.models.myfile.QuerySet'>: it's not the same object as myproject.myapp.models.myfile.QuerySet\n\nStrangely (or perhaps not so strangely), I only get that error when I call cPcikle from another method or a view, but not when I call it from the command line.\nI made the method below after reading PicklingError: Can't pickle <class 'decimal.Decimal'>: it's not the same object as decimal.Decimal and Django mod_wsgi PicklingError while saving object:\ndef dump_queryset(queryset, model):\n    from segment.segmentengine.models.segment import QuerySet\n    memo = {}\n    new_queryset = deepcopy(queryset, memo)\n    memo = {}\n    new_query = deepcopy(new_queryset.query, memo)\n    queryset = QuerySet(model=model, query=new_query)    \n    return cPickle.dumps(queryset.query)\n\nAs you can see, I am getting extremely desperate -- that method still yields the same error. Is there a known, non-hacky solution to this problem? \nEDIT: Tried using --noreload running on the django development server, but to no avail.\nEDIT2: I had a typo in the error I displayed above -- it was models.QuerySet, not models.mymodel.QuerySet that it was complaining about. There is another nuance here, which is that my models file is broken out into multiple modules, so the error is ACTUALLY:\n Can't pickle <class 'myproject.myapp.models.myfile.QuerySet'>: it's not the same object as myproject.myapp.models.myfile.QuerySet\n\nWhere myfile is one of the modules under models. I have an __ini__.py in models with the following line:\nfrom myfile import *\n\nI wonder if this is contributing to my issue. Is there some way to change my init to protect myself against this? Are there any other tests to try?\nEDIT3: Here is a little more background on my use case: I have a model called Context that I use to populate a UI element with instances of mymodel. The user can add/remove/manipulate the objects on the UI side, changing their context, and when they return, they can keep their changes, because the context serialized everything. A context has a generic foreign key to different types of filters/ways the user can manipulate the object, all of which must implement a few methods that the context uses to figure out what it should display. One such filter takes a queryset that can be passed in and displays all of the objects in that queryset. This provides a way to pass in arbitrary querysets that are produced elsewhere and have them displayed in the UI element. The model that uses the Context is hierarchical (using mptt for this), and the UI element makes a request to get children each time the user clicks around, we can then take the children and determine if they should be displayed based on whether or not they are included in the Context. Hope that helps!\nEDIT4: I am able to dump an empty queryset, but as soon as I add anything of value, it fails.\nEDIT4: I am on Django 1.2.3\n", "output": "Does this error only happen with cPickle? Have you tried it with standard Pickle?\nThere are other modules that have similar functionality to Pickle/cPickle, such as Gnosis Utilities' Pickle, which has the benefit of generating standard XML output using the gnosis.xml.pickle module. I'd try replacing cPickle with a different module to see if it has the same problem, or if it could be a bug in your cPickle implementation.\n"}
{"input": "Allowing only single active session per user in Django app I want to restrict logged-in users to only have one active session, i.e. if the user logs in with a new sessionid, the old session should be terminated.\nI found a lot of help on SO already:\nhere and here\nI implemented the middleware solution, with a bit of extra checking...\nclass OnlyOneUserMiddleware(object):\n\"\"\"\nMiddleware to ensure that a logged-in user only has one session active.\nWill kick out any previous session. \n\"\"\"\ndef process_request(self, request):\n    if request.user.is_authenticated():\n        try:\n            cur_session_key = request.user.get_profile().session_key\n            if cur_session_key and cur_session_key != request.session.session_key:\n                # Default handling... kick the old session...\n                Session.objects.get(session_key=cur_session_key).delete()\n            if not cur_session_key or cur_session_key != request.session.session_key:\n                p = request.user.get_profile()\n                p.session_key = request.session.session_key\n                p.save()\n        except ObjectDoesNotExist:\n            pass\n\nSo far, so good... on the Django dev server (manage.py runserver) everything works properly, it kicks the old session...\n...but when using Apache ( with mod_wsgi), it doesn't work!\nI've tried to find any information about this, but no luck so far... \nThe closest I have found is this, but it is kind of the 'opposite' problem...\nAny help would be much appreciated. \nEdit: I added a debug print before deleting the Session...\nhere's a snippet from Apache's error.log:\n[Fri Jan 20 09:56:50 2012] [error] old key = f42885ccb7f33b6afcb2c18fca14f44a\n[Fri Jan 20 09:56:50 2012] [error] new key = ce4cfb672e6025edb8ffcd0cf2b4b8d1\n[Fri Jan 20 09:57:14 2012] [error] old key = f42885ccb7f33b6afcb2c18fca14f44a\n[Fri Jan 20 09:57:14 2012] [error] new key = 0815c56241ac21cf4b14b326f0aa7e24\n\nthe first two lies are from when I entered with the first session (Firefox)\nthe last two are from when I entered with the second session (Chromium)\n... it turns out that the old Session record does not get deleted... ???\nI'm running vs. the exact same PostgreSQL instance as I did with the devserver...\nEdit2: It turned out that my code was buggy... it failed when the new Session_key wasn't found in Session...\nhere's the fixed code... the try..except is now in the correct place\nclass OnlyOneUserMiddleware(object):\n    \"\"\"\n    Middleware to ensure that a logged-in user only has one session active.\n    Will kick out any previous session. \n    \"\"\"\n    def process_request(self, request):\n        if request.user.is_authenticated():\n            cur_session_key = request.user.get_profile().session_key\n            if cur_session_key and cur_session_key != request.session.session_key:\n                # Default handling... kick the old session...\n                try:\n                    s = Session.objects.get(session_key=cur_session_key)\n                    s.delete()\n                except ObjectDoesNotExist:\n                    pass\n            if not cur_session_key or cur_session_key != request.session.session_key:\n                p = request.user.get_profile()\n                p.session_key = request.session.session_key\n                p.save()\n\n", "output": "You can always use this approach though not recommended, it works.\nmy_old_sessions = Session.objects.all()\nfor row in my_old_sessions:\n   if row.get_decoded().get(\"_username\") == request.user.username:\n      row.delete()\n\nYou would implement the code above in your login() function right before authenticating the user.\nThis of course only works if you have a login() function method that stores the USERS username in his session like follows:\nrequest.session[\"_username\"] = request.user.username\n\nIf you use this approach just remember to empty your database of all of your sessions before running your server after you've made these changes because it will raise KeyLookUp errors.\n"}
{"input": "Python socket receive - incoming packets always have a different size I'm using the SocketServer module for a TCP server.\nI'm experiencing some issue here with the recv() function, because the incoming packets always have a different size, so if I specify recv(1024) (I tried with a bigger value, and smaller), it gets stuck after 2 or 3 requests because the packet length will be smaller (I think), and then the server gets stuck until a timeout.\nclass Test(SocketServer.BaseRequestHandler):\n\ndef handle(self):\n\n   print \"From:\", self.client_address\n\n   while True:    \n\n     data = self.request.recv(1024)\n     if not data: break\n\n     if data[4] == \"\\x20\":              \n       self.request.sendall(\"hello\")\n     if data[4] == \"\\x21\":\n       self.request.sendall(\"bye\")\n     else:\n       print \"unknow packet\"\n   self.request.close()\n   print \"Disconnected\", self.client_address\n\nlaunch = SocketServer.ThreadingTCPServer(('', int(sys.argv[1])),Test)\n\nlaunch.allow_reuse_address= True;\n\nlaunch.serve_forever()\n\nIf the client sends multiples requests over the same source port, but the server gets stuck, any help would be very appreciated, thank !    \n", "output": "The answer by Larry Hastings has some great general advice about sockets, but there are a couple of mistakes as it pertains to how the recv(bufsize) method works in the Python socket module.\nSo, to clarify, since this may be confusing to others looking to this for help:\n\nThe bufsize param for the recv(bufsize) method is not optional.  You'll get an error if you call recv() (without the param).\nThe bufferlen in recv(bufsize) is a maximum size.  The recv will happily return fewer bytes if there are fewer available.\n\nSee the documentation for details.\nNow, if you're receiving data from a client and want to know when you've received all of the data, you're probably going to have to add it to your protocol -- as Larry suggests.  See this recipe for strategies for determining end of message.\nAs that recipe points out, for some protocols, the client will simply disconnect when it's done sending data.  In those cases, your while True loop should work fine.  If the client does not disconnect, you'll need to figure out some way to signal your content length, delimit your messages, or implement a timeout.\nI'd be happy to try to help further if you could post your exact client code and a description of your test protocol.\n"}
{"input": "Should I use Celery or Carrot for a Django project? I'm a little confused as to which one I should use. I think either will work, but is one better or more appropriate than the other?\nhttp://github.com/ask/carrot/tree/master\nhttp://github.com/ask/celery/tree/master\n", "output": "If you need to send/receive messages to/from AMQP message queues, use carrot.\nIf you want to run scheduled tasks on a number of machines, use celery.\nIf you're making soup, use both ;-)\n"}
{"input": "Is There An Operator To Calculate Percentage In Python? I've recently learned that the \" % \" sign is used to calculate the remainder of an integer in Python. However I was unable to determine if there's another operator or method to calculate percent in Python. Like with \" / \" which will give you the quotient, if you just use a float for one of the integers it will actually give you the answer like traditional division. So any method for percentage?\n", "output": "You could just divide your two numbers and multiply by 100. Note that this will throw an error if \"whole\" is 0, as asking what percentage of 0 a number is does not make sense:\ndef percentage(part, whole):\n  return 100 * float(part)/float(whole)\n\nOr if the question you wanted it to answer was \"what is 5% of 20\", rather than \"what percentage is 5 of 20\" (a different interpretation of the question inspired by Carl Smith's answer), you would write:\ndef percentage(percent, whole):\n  return (percent * whole) / 100.0\n\n"}
{"input": "What's the simplest way to subtract a month from a date in Python? If only timedelta had a month argument in it's constructor.  So what's the simplest way to do this?\nEDIT: I wasn't thinking too hard about this as was pointed out below.  Really what I wanted was any day in the last month because eventually I'm going to grab the year and month only.  So given a datetime object, what's the simplest way to return any datetime object that falls in the previous month?\n", "output": "You can use the third party dateutil module (PyPI entry here).\nimport datetime\nimport dateutil.relativedelta\n\nd = datetime.datetime.strptime(\"2013-03-31\", \"%Y-%m-%d\")\nd2 = d - dateutil.relativedelta.relativedelta(months=1)\nprint d2\n\noutput:\n2013-02-28 00:00:00\n\n"}
{"input": "Storing and updating lists in Python dictionaries: why does this happen? I have a list of data that looks like the following:\n// timestep,x_position,y_position\n0,4,7\n0,2,7\n0,9,5\n0,6,7\n1,2,5\n1,4,7\n1,9,0\n1,6,8\n\n... and I want to make this look like:\n0, (4,7), (2,7), (9,5), (6,7)\n1, (2,5), (4,7), (9,0), (6.8)\n\nMy plan was to use a dictionary, where the value of t is the key for the dictionary, and the value against the key would be a list. I could then append each (x,y) to the list. Something like:\n# where t = 0, c = (4,7), d = {}\n\n# code 1\nd[t].append(c)\n\nNow this causes IDLE to fail. However, if I do:\n# code 2\nd[t] = []\nd[t].append(c)\n\n... this works. \nSo the question is: why does code 2 work, but code 1 doesn't?\nPS Any improvement on what I'm planning on doing would be of great interest!! I think I will have to check the dictionary on each loop through the input to see if the dictionary key already exists, I guess by using something like max(d.keys()): if it is there, append data, if not create the empty list as the dictionary value, and then append data on the next loop through. \n", "output": "Let's look at\nd[t].append(c)\n\nWhat is the value of d[t]?  Try it.\nd = {}\nt = 0\nd[t]\n\nWhat do you get?  Oh.  There's nothing in d that has a key of t.\nNow try this.\nd[t] = []\nd[t]\n\nAhh.  Now there's something in d with a key of t.\nThere are several things you can do.  \n\nUse example 2.\nUse setdefault.  d.setdefault(t,[]).append(c).\nUse collections.defaultdict.  You'd use a defaultdict(list) instead of a simple dictionary, {}.\n\n\nEdit 1.  Optimization\nGiven input lines from a file in the above form: ts, x, y, the grouping process is needless.  There's no reason to go from a simple list of ( ts, x, y ) to a more complex\nlist of ( ts, (x,y), (x,y), (x,y), ... ).  The original list can be processed exactly as it arrived.\nd= collections.defaultdict(list)\nfor ts, x, y in someFileOrListOrQueryOrWhatever:\n    d[ts].append( (x,y) )\n\n\nEdit 2.  Answer Question\n\"when initialising a dictionary, you need to tell the dictionary what the key-value data structure will look like?\"\nI'm not sure what the question means.  Since, all dictionaries are key-value structures, the question's not very clear.  So, I'll review the three alternatives, which may answer the question.\nExample 2.\nInitialization\nd= {}\n\nUse\nif t not in d:\n    d[t] = list()\nd[t].append( c )\n\nEach dictionary value must be initialized to some useful structure.  In this case, we check to see if the key is present; when the key is missing, we create the key and assign an empty list.\nSetdefault\nInitialization\nd= {}\n\nUse\nd.setdefault(t,list()).append( c )\n\nIn this case, we exploit the setdefault method to either fetch a value associated with a key or create a new value associated with a missing key.\ndefault dict\nInitialization\nimport collections\nd = collections.defaultdict(list)\n\nUse\nd[t].append( c )\n\nThe defaultdict uses an initializer function for missing keys.  In this case, we provide the list function so that a new, empty list is created for a missing key.\n"}
{"input": "Database on the fly with scripting languages I have a set of .csv files that I want to process. It would be far easier to process it with SQL queries. I wonder if there is some way to load a .csv file and use SQL language to look into it with a scripting language like python or ruby. Loading it with something similar to ActiveRecord would be awesome. \nThe problem is that I don't want to have to run a database somewhere prior to running my script. I souldn't have additionnal installations needed outside of the scripting language and some modules.\nMy question is which language and what modules should I use for this task. I looked around and can't find anything that suits my need. Is it even possible?\n", "output": "There's sqlite3, included into python. With it you can create a database (on memory) and add rows to it, and perform SQL queries.\nIf you want neat ActiveRecord-like functionality you should add an external ORM, like sqlalchemy. That's a separate download though\nQuick example using sqlalchemy:\nfrom sqlalchemy import create_engine, Column, String, Integer, MetaData, Table\nfrom sqlalchemy.orm import mapper, create_session\nimport csv\nCSV_FILE = 'foo.csv'\nengine = create_engine('sqlite://') # memory-only database\n\ntable = None\nmetadata = MetaData(bind=engine)\nwith open(CSV_FILE) as f:\n    # assume first line is header\n    cf = csv.DictReader(f, delimiter=',')\n    for row in cf:\n        if table is None:\n            # create the table\n            table = Table('foo', metadata, \n                Column('id', Integer, primary_key=True),\n                *(Column(rowname, String()) for rowname in row.keys()))\n            table.create()\n        # insert data into the table\n        table.insert().values(**row).execute()\n\nclass CsvTable(object): pass\nmapper(CsvTable, table)\nsession = create_session(bind=engine, autocommit=False, autoflush=True)\n\nNow you can query the database, filtering by any field, etc.\nSuppose you run the code above on this csv:\nname,age,nickname\nnosklo,32,nosklo\nAfila Tun,32,afilatun\nFoo Bar,33,baz\n\nThat will create and populate a table in memory with fields name, age, nickname. You can then query the table:\nfor r in session.query(CsvTable).filter(CsvTable.age == '32'):\n    print r.name, r.age, r.nickname\n\nThat will automatically create and run a SELECT query and return the correct rows.\nAnother advantage of using sqlalchemy is that, if you decide to use another, more powerful database in the future, you can do so pratically without changing the code.\n"}
{"input": "Replacing a substring of a string with Python I'd like to get a few opinions on the best way to replace a substring of a string with some other text. Here's an example:\nI have a string, a, which could be something like \"Hello my name is $name\". I also have another string, b, which I want to insert into string a in the place of its substring '$name'.\nI assume it would be easiest if the replaceable variable is indicated some way. I used a dollar sign, but it could be a string between curly braces or whatever you feel would work best.\nSolution:\nHere's how I decided to do it:\nfrom string import Template\n\n\nmessage = 'You replied to $percentageReplied of your message. ' + \n    'You earned $moneyMade.'\n\ntemplate = Template(message)\n\nprint template.safe_substitute(\n    percentageReplied = '15%',\n    moneyMade = '$20')\n\n", "output": "Here are the most common ways to do it:\n>>> import string\n>>> t = string.Template(\"Hello my name is $name\")\n>>> print t.substitute(name='Guido')\nHello my name is Guido\n\n>>> t = \"Hello my name is %(name)s\"\n>>> print t % dict(name='Tim')\nHello my name is Tim\n\n>>> t = \"Hello my name is {name}\"\n>>> print t.format(name='Barry')\nHello my name is Barry\n\nThe approach using string.Template is easy to learn and should be familiar to bash users.  It is suitable for exposing to end-users.  This style became available in Python 2.4.\nThe percent-style will be familiar to many people coming from other programming languages.  Some people find this style to be error-prone because of the trailing \"s\" in %(name)s, because the %-operator has the same precedence as multiplication, and because the    behavior of the applied arguments depends on their data type (tuples and dicts get special handling).  This style has been supported in Python since the beginning.\nThe curly-bracket style is only supported in Python 2.6 or later.  It is the most flexible style (providing a rich set of control characters and allowing objects to implement custom formatters).\n"}
{"input": "Checking date against date range in Python I have a date variable: 2011-01-15 and I would like to get a boolean back if said date is within 3 days from TODAY. Im not quite sure how to construct this in Python. Im only dealing with date, not datetime.\nMy working example is a \"grace period\". A user logs into my site and if the grace period is within 3 days of today, additional scripts, etc. are omitted for that user.\nI know you can do some fancy/complex things in Python's date module(s) but Im not sure where to look.\n", "output": "In Python to check a range you can use a <= x <= b:\n>>> import datetime\n>>> today = datetime.date.today()\n>>> margin = datetime.timedelta(days = 3)\n\n>>> today - margin <= datetime.date(2011, 1, 15) <= today + margin\nTrue\n\n"}
{"input": "Python comments: # vs strings I have a question regarding the \"standard\" way to put comments inside Python source code:\ndef func():\n    \"Func doc\"\n    ... <code>\n    'TODO: fix this'\n    #badFunc()\n    ... <more code>\n\ndef func():\n    \"Func doc\"\n    ... <code>\n    #TODO: fix this\n    #badFunc()\n    ... <more code>\n\nI prefer to write general comments as strings instead of prefixing #'s.\nThe official python style guide doesn't mention using strings as comments (If I didn't miss it while reading it)\nI like it that way mainly because I think the # character looks ugly with comment blocks. As far as I know these strings don't do anything.\nThe question is: Are there disadvantages in doing this?\n", "output": "Don't misuse strings (no-op statements) as comments. Docstrings, e.g. the first string in a module, class or function, are special and definitely recommended.\nNote that docstrings are documentation, and documentation and comments are two different things!\n\nDocumentation is important to understand what the code does.\nComments explain how the code does it.\n\nDocumentation is read by people who use your code, comments by people who want to understand your code, e.g. in order to maintain it.\nUsing strings for commentation has the following (potential) disadvantages:\n\nIt confuses people who don't know that the string does nothing.\nComments and string literals are highlighted differently in code editors, so your style may make your code harder to read.\nIt might effect performance and/or memory usage (if the strings are not removed during bytecode compilation, removing comments is done on the scanner level so its definitively cheaper)\n\nMost important for Python programmers: It is not pythonic:\n\nThere should be one-- and preferably only one --obvious way to do it.\n\nStick to the standards, use comments.\n"}
{"input": "number of values in a list greater than a certain number I have a list of numbers and I want to get the number of times a number appears in a list that meets a certain criteria. I can use a list comprehension (or a list comprehension in a function) but I am wondering if someone has a shorter way.\n# list of numbers\nj=[4,5,6,7,1,3,7,5]\n#list comprehension of values of j > 5\nx = [i for i in j if i>5]\n#value of x\nlen(x)\n\n#or function version\ndef length_of_list(list_of_numbers, number):\n     x = [i for i in list_of_numbers if j > number]\n     return len(x)\nlength_of_list(j, 5)\n\nis there an even more condensed version?\n", "output": "You could do something like this:\n>>> j = [4, 5, 6, 7, 1, 3, 7, 5]\n>>> sum(i > 5 for i in j)\n3\n\nIt might initially seem strange to add True to True this way, but I don't think it's unpythonic; after all, bool is a subclass of int in all versions since 2.3:\n>>> issubclass(bool, int)\nTrue\n\n"}
{"input": "Python: read all text file lines in loop I want to read huge text file line by line (and stop if a line with \"str\" found).\nHow to check, if file-end is reached?\nfn = 't.log'\nf = open(fn, 'r')\nwhile not _is_eof(f): ## how to check that end is reached?\n    s = f.readline()\n    print s\n    if \"str\" in s: break\n\n", "output": "There's no need to check for EOF in python, simply do:\nwith open('t.ini') as f:\n   for line in f:\n       print line\n       if 'str' in line:\n          break\n\nWhy the with statement:\n\nIt is good practice to use the with keyword when dealing with file\n  objects. This has the advantage that the file is properly closed after\n  its suite finishes, even if an exception is raised on the way.\n\n"}
{"input": "Python: namedtuple._replace() doesn't work as descrbed in the documentation I was having trouble implementing namedtuple._replace(), so I copied the code right off of the documentation:\nPoint = namedtuple('Point', 'x,y')\n\np = Point(x=11, y=22)\n\np._replace(x=33)\n\nprint p\n\nand I got:\nPoint(x=11, y=22)\n\ninstead of:\nPoint(x=33, y=22)\n\nas is shown in the doc.\nI'm using Python 2.6 on Windows 7\nWhat's going on?\n", "output": "Yes it does, it works exactly as documented.\n._replace returns a new namedtuple, it does not modify the original, so you need to write this:\np = p._replace(x=33)\n\nSee here: somenamedtuple._replace(kwargs) for more information.\n"}
{"input": "Calculating Time Difference at the start and end of my program, I have \nfrom time import strftime\nprint int(strftime(\"%Y-%m-%d %H:%M:%S\")\n\n\n\nY1=int(strftime(\"%Y\"))\nm1=int(strftime(\"%m\"))\nd1=int(strftime(\"%d\"))\nH1=int(strftime(\"%H\"))\nM1=int(strftime(\"%M\"))\nS1=int(strftime(\"%S\"))\n\n\nY2=int(strftime(\"%Y\"))\nm2=int(strftime(\"%m\"))\nd2=int(strftime(\"%d\"))\nH2=int(strftime(\"%H\"))\nM2=int(strftime(\"%M\"))\nS2=int(strftime(\"%S\"))\n\nprint \"Difference is:\"+str(Y2-Y1)+\":\"+str(m2-m1)+\":\"+str(d2-d1)\\\n          +\" \"+str(H2-H1)+\":\"+str(M2-M1)+\":\"+str(S2-S1)\n\nBut when I tried to get the difference, I get syntax errors.... I am doing a few things wrong, but I'm not sure what is going on...\nBasically, I just want to store a time in a variable at the start of my program, then store a 2nd time in a second variable near the end, then at the last bit of the program, compute the difference and display it. I am not trying to time a function speed. I am trying to log how long it took for a user to progress through some menus. What is the best way to do this?\n", "output": "The datetime module will do all the work for you:\n>>> import datetime\n>>> a = datetime.datetime.now()\n>>> # ...wait a while...\n>>> b = datetime.datetime.now()\n>>> print(b-a)\n0:03:43.984000\n\nIf you don't want to display the microseconds, just use (as gnibbler suggested):\n>>> a = datetime.datetime.now().replace(microsecond=0)\n>>> b = datetime.datetime.now().replace(microsecond=0)\n>>> print(b-a)\n0:03:43\n\n"}
{"input": "Escape double quotes for JSON in Python How can I replace double quotes with a backslash and double quotes in Python?  \n>>> s = 'my string with \"double quotes\" blablabla'\n>>> s.replace('\"', '\\\\\"')\n'my string with \\\\\"double quotes\\\\\" blablabla'\n>>> s.replace('\"', '\\\\\\\"')\n'my string with \\\\\"double quotes\\\\\" blablabla'\n\nI would like to get the following:\n'my string with \\\"double quotes\\\" blablabla'\n\n", "output": "You should be using the json module. json.dumps(string). It can also serialize other python data types.\nimport json\n\n>>> s = 'my string with \"double quotes\" blablabla'\n\n>>> json.dumps(s)\n<<< '\"my string with \\\\\"double quotes\\\\\" blablabla\"'\n\n"}
{"input": "python pip install psycopg2 install error I did a simple pip install psycopg2 on mac system. It installed fine, but when I try to use psycopg2 I get the error:\nReason: Incompatible library version: _psycopg.so requires version 1.0.0 or later, but libssl.0.9.8.dylib provides version 0.9.8\n\npip freeze shows psycopg2==2.4.5 just right. I have installed psycopg2 on several virtualenvs but this is the first time I am seeing such error. I tried uninstalling and reinstalling, same results. Please help\n", "output": "The accepted answer here is correct (except I think it must be ln -fs , in fact I think it might even risk destabalizing your OS if not (?)). After bumping into this and dealing with it I just want to collect the full solution for this issue and the other lib problem (libcrypto.1.0.0.dylib) you will run into for Postgres 9.* on Mountain Lion and Snow Leopard, and perhaps other systems. This also blocked me from running psql, which complained about the same two libs.\nEssentially there are two later-version libs needed in /usr/lib, libssl and libcrypto. You can find the needed versions of these libs in the Postgres lib directory. \n\nIf you're OSX and installed the Enterprise DB version of Postgres this will be in /Library/PostgreSQL/9.2/lib.\nFor other install types of Postgres, look for the lib directory inside the Postgress install directory, e.g., for Postgress.app, find the lib directory in /Applications/Postgres.app/Contents/MacOS/lib,\nfor brew somewhere in /usr/local/Cellar,\non *nix, wherever your install is. But see first on *nix if your distro has later versions just through the package manager.\n\nFirst copy the latest of these two libs from the Postgres lib directory to /usr/lib:\nsudo cp /Library/PostgreSQL/9.2/lib/libssl.1.0.0.dylib /usr/lib\nsudo cp /Library/PostgreSQL/9.2/lib/libcrypto.1.0.0.dylib /usr/lib\n\nThen update (or create) the /usr/lib symlinks for this libs. Either way the command is ln -fs:\nsudo ln -fs /usr/lib/libssl.1.0.0.dylib /usr/lib/libssl.dylib\nsudo ln -fs /usr/lib/libcrypto.1.0.0.dylib /usr/lib/libcrypto.dylib\n\nShould be fixed. Pretty sure ln -fs is better than deleting the symlink and remaking it, so there is less chance of libssl being unfindable by something that needs it for the time it is not present (it does the same thing; it first deletes the symlink if it's already there, just faster than you can type it). Always wary of messing around on /usr/lib.\n"}
{"input": "ImportError: cannot import name SignedJwtAssertionCredentials I'm trying to access a google app through the Python Client using this code to gain authorization (private info obviously redacted): \nimport gflags\nimport httplib2\n\nfrom apiclient.discovery import build\nfrom oauth2client.file import Storage\nfrom oauth2client.client import SignedJwtAssertionCredentials\nfrom oauth2client.tools import run\n\nf = open('privatekey.p12', 'rb')\nkey = f.read()\nf.close()\ncredentials = SignedJwtAssertionCredentials(\n    service_account_name='name@developer.gserviceaccount.com',\n    private_key=key,\n    scope = 'https://www.googleapis.com/auth/calendar')\nhttp = httplib2.Http()\nhttp = credentials.authorize(http)\nservice = build(serviceName='calendar', version='v3', http=http)\n\nYet I receive this error: \nImportError: cannot import name SignedJwtAssertionCredentials\n\nI have installed the Google v3 API Python Client as well as OAuth2; I don't seem to be having any other problems with those modules, though I haven't used them much. Anyone know what's going on? \n", "output": "I had this problem today and had to roll back from oauth2client version 2.0 to version 1.5.2 with:\npip install oauth2client==1.5.2\n\n"}
{"input": "what is blocking and non-blocking web server, what difference between both? i have seen many web framework provide non-blocking web server, i just want to know what it mean.\n", "output": "a blocking web-server is similar to a phone call. you need to wait on-line to get a response and continue; where as a non-blocking web-server is like a sms service. you sms your request,do your things and react when you receive an sms back!\n"}
{"input": "Django App Dependency Cycle I am in the middle of developing a Django application, which has quite complicated models (it models a university - courses, modules, lectures, students etc.)\nI have separated the project into apps, to make the whole thing more organised (apps are courses, schools, people, modules and timeperiods). I am having a problem whereby a model in one app may depend on a model in another - so I must import it. The second app then in turn depends on a model in the first, so there is a cycle and Python throws up an error.\nHow do people deal with this? I understand that apps should be relatively \"independent\", but in a system like this it doesn't make sense, for example, to use ContentTypes to link students to a module.\nDoes anyone have a similar project that could comment on this case?\n", "output": "If your dependency is with foreign keys referencing models in other applications, you don't need to import the other model. You can use a string in your ForeignKey definition:\nclass MyModel(models.Model):\n    myfield = models.ForeignKey('myotherapp.MyOtherModel')\n\nThis way there's no need to import MyOtherModel, so no circular reference. Django resolves the string internally, and it all works as expected.\n"}
{"input": "Iteration over list slices Good day function-wizards,\nI want an algorithm to iterate over list slices. Slices size is set outside the function and can differ.\nIn my mind it is something like:\nfor list_of_x_items in fatherList:\n    foo(list_of_x_items)\n\nIs there a way to properly define list_of_x_items or some other way of doing this?\nThank you greatly.\nPS: using python 2.5\n\nedit1: So many beautiful answers.. Someone has asked to clarify. Both \"partitioning\" and \"sliding window\" terms sound applicable to my task, but I am no expert. So I will explain  the problem a bit deeper and add to the question:\nThe fatherList is a multilevel numpy.array I am getting from a file. Function has to find averages of series (user provides the length of series) For averaging I am using the mean() function. Now for question expansion:\nquestion update: How to modify the function you have provided to store the extra items and use them when the next fatherList is fed to the function?\nfor example if the list is lenght 10 and size of a chunk is 3, then the 10th member of the list is stored and appended to the beginning of the next list.\nHope I am not assaulting anyone by not creating a separate question.\n\nedit2: Ah, SO will you marry me? Many beautiful answers and only one to choose.. Thank you everyone, I've learned more from this question than from a month of my university studies.\nI am choosing the OO answer because it will allow me more flexibility in the future.\n\nRelated:\n\nWhat is the most \u00e2\u0080\u009cpythonic\u00e2\u0080\u009d way to iterate over a list in chunks?\n\n", "output": "If you want to divide a list into slices you can use this trick:\nlist_of_slices = zip(*(iter(the_list),) * slice_size)\n\nFor example\n>>> zip(*(iter(range(10)),) * 3)\n[(0, 1, 2), (3, 4, 5), (6, 7, 8)]\n\nIf the number of items is not dividable by the slice size and you want to pad the list with None you can do this:\n>>> map(None, *(iter(range(10)),) * 3)\n[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, None, None)]\n\nIt is a dirty little trick\n\nOK, I'll explain how it works. It'll be tricky to explain but I'll try my best.\nFirst a little background:\nIn Python you can multiply a list by a number like this:\n[1, 2, 3] * 3 -> [1, 2, 3, 1, 2, 3, 1, 2, 3]\n([1, 2, 3],) * 3 -> ([1, 2, 3], [1, 2, 3], [1, 2, 3])\n\nAnd an iterator object can be consumed once like this:\n>>> l=iter([1, 2, 3])\n>>> l.next()\n1\n>>> l.next()\n2\n>>> l.next()\n3\n\nThe zip function returns a list of tuples, where the i-th tuple contains the i-th element from each of the argument sequences or iterables. For example:\nzip([1, 2, 3], [20, 30, 40]) -> [(1, 20), (2, 30), (3, 40)]\nzip(*[(1, 20), (2, 30), (3, 40)]) -> [[1, 2, 3], [20, 30, 40]]\n\nThe * in front of zip used to unpack arguments. You can find more details here.\nSo\nzip(*[(1, 20), (2, 30), (3, 40)])\n\nis actually equivalent to \nzip((1, 20), (2, 30), (3, 40))\n\nbut works with a variable number of arguments\nNow back to the trick:\nlist_of_slices = zip(*(iter(the_list),) * slice_size)\n\niter(the_list) -> convert the list into an iterator\n(iter(the_list),) * N -> will generate an N reference to the_list iterator.\nzip(*(iter(the_list),) * N) -> will feed those list of iterators into zip. Which in turn will group them into N sized tuples. But since all N items are in fact references to the same iterator iter(the_list) the result will be repeated calls to next() on the original iterator\nI hope that explains it. I advice you to go with an easier to understand solution. I was only tempted to mention this trick because I like it.\n"}
{"input": "Silence the stdout of a function in Python without trashing sys.stdout and restoring each function call Is there a way in Python to silence stdout without wrapping a function call like following?\nOriginal Broken Code:\nfrom sys import stdout\nfrom copy import copy\nsave_stdout = copy(stdout)\nstdout = open('trash','w')\nfoo()\nstdout = save_stdout\n\nEdit: Corrected code from Alex Martelli\nimport sys\nsave_stdout = sys.stdout\nsys.stdout = open('trash', 'w')\nfoo()\nsys.stdout = save_stdout\n\nThat way works but appears to be terribly inefficient. There has to be a better way. Any ideas?\n", "output": "Assigning the stdout variable as you're doing has no effect whatsoever, assuming foo contains print statements -- yet another example of why you should never import stuff from inside a module (as you're doing here), but always a module as a whole (then use qualified names).  The copy is irrelevant, by the way.  The correct equivalent of your snippet is:\nimport sys\nsave_stdout = sys.stdout\nsys.stdout = open('trash', 'w')\nfoo()\nsys.stdout = save_stdout\n\nNow, when the code is correct, is the time to make it more elegant or fast.  For example, you could use an in-memory file-like object instead of file 'trash':\nimport sys\nimport cStringIO\nsave_stdout = sys.stdout\nsys.stdout = cStringIO.StringIO()\nfoo()\nsys.stdout = save_stdout\n\nfor elegance, a context is best, e.g:\nimport contextlib\nimport sys\nimport cStringIO\n\n@contextlib.contextmanager\ndef nostdout():\n    save_stdout = sys.stdout\n    sys.stdout = cStringIO.StringIO()\n    yield\n    sys.stdout = save_stdout\n\nonce you have defined this context, for any block in which you don't want a stdout,\nwith nostdout():\n    foo()\n\nMore optimization: you just need to replace sys.stdout with an object that has a no-op write method.  For example:\nimport contextlib\nimport sys\n\nclass DummyFile(object):\n    def write(self, x): pass\n\n@contextlib.contextmanager\ndef nostdout():\n    save_stdout = sys.stdout\n    sys.stdout = DummyFile()\n    yield\n    sys.stdout = save_stdout\n\nto be used the same way as the previous implementation of nostdout.  I don't think it gets any cleaner or faster than this;-).\n"}
{"input": "Create or append to a list in a dictionary - can this be shortened? Can this Python code be shortened and still be readable using itertools and sets?\nresult = {}\nfor widget_type, app in widgets:\n    if widget_type not in result:\n        result[widget_type] = []\n    result[widget_type].append(app)\n\nI can think of this only:\nwidget_types = zip(*widgets)[0]\ndict([k, [v for w, v in widgets if w == k]) for k in set(widget_types)])\n\n", "output": "An alternative to defaultdict is to use the setdefault method of standard dictionaries:\n result = {}\n for widget_type, app in widgets:\n     result.setdefault(widget_type, []).append(app)\n\nThis relies on the fact that lists are mutable, so what is returned from setdefault is the same list as the one in the dictionary, therefore you can append to it.\n"}
{"input": "Passing command Line argument to Python script within Eclipse(Pydev) I am new to Python & Eclipse, and having some difficulties understanding how to pass command line argument to script running within Eclipse(Pydev).\nThe following link explains how to pass command line argument to python script.\nTo pass command line argument to module argecho.py(code from link above),\n#argecho.py\nimport sys\n\nfor arg in sys.argv: 1\n    print arg\n\nI would need to type into python console \n[you@localhost py]$ python argecho.py             \nargecho.py\n\nor\n[you@localhost py]$ python argecho.py abc def     \nargecho.py\nabc\ndef\n\nHow would I pass same arguments to Python script within Eclipse(Pydev) ???\nThanks !\n", "output": "Click on the play button down arrow in the tool bar -> run configurations -> (double click) Python Run -> Arguments tab on the right hand side. \nFrom there you can fill out the Program Arguments text box:\n\n"}
{"input": "Extracting a region from an image using slicing in Python, OpenCV I have an image and I want to extract a region from it. I have coordinates of left upper corner and right lower corner of this region. In gray scale I do it like this:\nI = cv2.imread(\"lena.png\")\nI = cv2.cvtColor(I, cv2.COLOR_RGB2GRAY)\nregion = I[248:280,245:288]\ntools.show_1_image_pylab(region)\n\nI can't figure it out how to do it in color. I thought of extracting each channel R, G, B; slicing this region from each of the channels and to merge them back together but there is gotta be a shorter way. \n", "output": "There is a slight difference in pixel ordering in OpenCV and Matplotlib.\nOpenCV follows BGR order, while matplotlib likely follows RGB order.\nSo when you display an image loaded in OpenCV using pylab functions, you may need to convert it into RGB mode. ( I am not sure if any easy method is there). Below method demonstrate it:\nimport cv2\nimport numpy as np\nimport matplotlib.pyplot as plt\n\nimg = cv2.imread('messi4.jpg')\nb,g,r = cv2.split(img)\nimg2 = cv2.merge([r,g,b])\nplt.subplot(121);plt.imshow(img) # expects distorted color\nplt.subplot(122);plt.imshow(img2) # expect true color\nplt.show()\n\ncv2.imshow('bgr image',img) # expects true color\ncv2.imshow('rgb image',img2) # expects distorted color\ncv2.waitKey(0)\ncv2.destroyAllWindows()\n\nNB : Please check @Amro 's comment below for better method of conversion between BGR and RGB. img2 = img[:,:,::-1] . Very simple.\nRun this code and see the difference in result yourself. Below is what I got :\nUsing Matplotlib :\n\nUsing OpenCV :\n\n"}
{"input": "Setting the window to a fixed size with Tkinter This program will create a window where a message is displayed according to a check box.\nfrom Tkinter import *\n\nclass App:\n    def __init__(self,master):\n        self.var = IntVar()\n        frame = Frame(master)\n        frame.grid()\n        f2 = Frame(master,width=200,height=100)\n        f2.grid(row=0,column=1)\n        button = Checkbutton(frame,text='show',variable=self.var,command=self.fx)\n        button.grid(row=0,column=0)\n        msg2=\"\"\"I feel bound to give them full satisfaction on this point\"\"\"\n        self.v= Message(f2,text=msg2)\n    def fx(self):\n        if self.var.get():\n            self.v.grid(column=1,row=0,sticky=N)\n        else:\n            self.v.grid_remove()\n\ntop = Tk()\napp = App(top)            \ntop.mainloop()\n\nHow can I make the window size constant when the message is displayed and the message is not displayed.\n", "output": "This code makes a window with the conditions that the user cannot change the dimensions of the Tk() window, and also disables the maximise button.\nimport tkinter as tk\n\nroot = tk.Tk()\nroot.resizable(width=False, height=False)\nroot.mainloop()\n\nWithin the program you can change the window dimensions with @Carpetsmoker's answer, or by doing this:\nroot.geometry('{}x{}'.format(<widthpixels>, <heightpixels>))\n\nIt should be fairly easy for you to implement that into your code. :)\n"}
{"input": "General Command pattern and Command Dispatch pattern in Python I was looking for a Command pattern implementation in Python...\n(According to Wikipedia, \n\nthe command pattern is a design\n  pattern in which an object is used to\n  represent and encapsulate all the\n  information needed to call a method at\n  a later time.\n\n)\nThe only thing I found was Command Dispatch pattern:\nclass Dispatcher:\n\n    def do_get(self): ...\n\n    def do_put(self): ...\n\n    def error(self): ...\n\n    def dispatch(self, command):\n        mname = 'do_' + command\n        if hasattr(self, mname):\n            method = getattr(self, mname)\n            method()\n        else:\n            self.error()\n\nMay be I'm wrong, but it looks like these are two different concepts, which accidentally have similar names.\nAm i missing something?\n", "output": "The simplest command pattern is already built into Python, simply use a callable:\ndef greet(who):\n    print \"Hello %s\" % who\n\ngreet_command = lambda: greet(\"World\")\n# pass the callable around, and invoke it later\ngreet_command()\n\nThe command pattern as an object oriented design pattern makes more sense if your commands need to be able to do more than just be invoked. Common usecase is when you need to be able to undo/redo your actions. Then a command class is a good way to couple the forward and backwards actions together. For example:\nclass MoveFileCommand(object):\n    def __init__(self, src, dest):\n        self.src = src\n        self.dest = dest\n        os.rename(self.src, self.dest)\n    def undo(self):\n        os.rename(self.dest, self.src)\n\nundo_stack = []\nundo_stack.append(MoveFileCommand('foo.txt', 'bar.txt'))\nundo_stack.append(MoveFileCommand('bar.txt', 'baz.txt'))\n# foo.txt is now renamed to baz.txt\nundo_stack.pop().undo() # Now it's bar.txt\nundo_stack.pop().undo() # and back to foo.txt\n\n"}
{"input": "Is there a way to get a list of column names in sqlite? I want to get a list of column names from a table in a database. Using pragma I get a list of tuples with a lot of unneeded information. Is there a way to get only the column names? So I might end up with something like this:\n\n[Column1, Column2, Column3, Column4]\n\nThe reason why I absolutely need this list is because I want to search for a column name in the list and get the index because the index is used in a lot of my code.\nIs there a way of getting a list like this?\nThanks\n", "output": "You can use sqlite3 and pep-249\nimport sqlite3\nconnection = sqlite3.connect('~/foo.sqlite')\ncursor = connection.execute('select * from bar')\n\ncursor.description is description of columns\nnames = list(map(lambda x: x[0], cursor.description))\n\nAlternatively you could use a list comprehension:\nnames = [description[0] for description in cursor.description]\n\n"}
{"input": "ImportError: No module named sqlalchemy I'm unable to find a module in python ,though easy_install says its already installed.\nAny idea how to resolve this isseue? \n$ python -c \"from flaskext.sqlalchemy import SQLAlchemy\"\nTraceback (most recent call last):\n  File \"<string>\", line 1, in <module>\nImportError: No module named sqlalchemy\n\n\n$ python -V\nPython 2.7\n\n\n$ sudo easy_install sqlalchemy\nSearching for sqlalchemy\nBest match: SQLAlchemy 0.7.7\nAdding SQLAlchemy 0.7.7 to easy-install.pth file\n\nUsing /usr/lib/python2.7/site-packages\nProcessing dependencies for sqlalchemy\nFinished processing dependencies for sqlalchemy\n\n\n$ sudo pip install SQLAlchemy --upgrade Requirement already\n  up-to-date: SQLAlchemy in /usr/lib/python2.7/site-packages Cleaning\n  up...\n\nThough pip says it's installed.But I can't find them in sys.path output.\n$ sudo python -c \"import sys;print sys.path\" ['',\n'/usr/lib/python2.7/site-packages/Flask_SQLAlchemy-0.15-py2.7.egg',\n'/usr/lib/python2.7/site-packages/Flask-0.8-py2.7.egg',\n'/usr/lib/python2.7/site-packages/Jinja2-2.6-py2.7.egg',\n'/usr/lib/python2.7/site-packages/Werkzeug-0.8.3-py2.7.egg',\n'/usr/lib/python2.7/site-packages/Flask_WTF-0.5.2-py2.7.egg',\n'/usr/lib/python2.7/site-packages/WTForms-0.6.3-py2.7.egg',\n'/usr/lib/python2.7/site-packages/Flask_Mail-0.6.1-py2.7.egg',\n'/usr/lib/python2.7/site-packages/blinker-1.2-py2.7.egg',\n'/usr/lib/python2.7/site-packages/lamson-1.1-py2.7.egg',\n'/usr/lib/python2.7/site-packages/python_daemon-1.6-py2.7.egg',\n'/usr/lib/python2.7/site-packages/nose-1.1.2-py2.7.egg',\n'/usr/lib/python2.7/site-packages/mock-0.8.0-py2.7.egg',\n'/usr/lib/python2.7/site-packages/chardet-1.0.1-py2.7.egg',\n'/usr/lib/python2.7/site-packages/lockfile-0.9.1-py2.7.egg',\n'/usr/lib/python2.7/site-packages/Flask_FlatPages-0.2-py2.7.egg',\n'/usr/lib/python2.7/site-packages/Markdown-2.1.1-py2.7.egg',\n'/usr/lib/python2.7/site-packages/PyYAML-3.10-py2.7-linux-i686.egg',\n'/usr/lib/python2.7/site-packages/uWSGI-1.0.3-py2.7.egg',\n'/usr/lib/python2.7/site-packages/MySQL_python-1.2.3-py2.7-linux-i686.egg',\n'/usr/lib/python27.zip', '/usr/lib/python2.7',\n'/usr/lib/python2.7/plat-linux2', '/usr/lib/python2.7/lib-tk',\n'/usr/lib/python2.7/lib-old', '/usr/lib/python2.7/lib-dynload',\n'/usr/lib/python2.7/site-packages',\n'/usr/lib/python2.7/site-packages/setuptools-0.6c11-py2.7.egg-info']\n\n", "output": "Did you install flaskext.sqlalchemy?  It looks like you have the SQLAlchemy package installed but not the flask Extension.  Try pip install Flask-SQLAlchemy or easy_install Flask-SQLAlchemy.  It is available in the cheeseshop so either utility should pick it up.\n"}
{"input": "How to set Selenium Python WebDriver default timeout? Trying to find a good way to set a maximum time limit for command execution latency in Selenium Python WebDriver.  Ideally, something like:\nmy_driver = get_my_driver()\nmy_driver.set_timeout(30) # seconds\nmy_driver.get('http://www.example.com') # stops / throws exception when time is over 30     seconds\n\nwould work.  I have found .implicitly_wait(30), but I'm not sure if it results in the desired behavior.\nIn case it is useful, we are specifically using the WebDriver for Firefox.\nEDIT\nAs per @amey's answer, this might be useful:\nff = webdriver.Firefox()\nff.implicitly_wait(10) # seconds\nff.get(\"http://somedomain/url_that_delays_loading\")\nmyDynamicElement = ff.find_element_by_id(\"myDynamicElement\")\n\nHowever, it is not clear to me whether the implicit wait applies both to get (which is the desired functionality) and to find_element_by_id.\nThanks very much!\n", "output": "The method to create a timeout for a page to load, in Python, is:\ndriver.set_page_load_timeout(30)\n\nThis will throw a TimeoutException whenever the page load takes more than 30 seconds.\n"}
{"input": "How to convert datetime.timedelta to minutes, hours in Python? I get a start_date like this:\nfrom django.utils.timezone import utc\nimport datetime\n\nstart_date = datetime.datetime.utcnow().replace(tzinfo=utc)\nend_date = datetime.datetime.utcnow().replace(tzinfo=utc)\nduration = end_date - start_date\n\nI get output like this:\ndatetime.timedelta(0, 5, 41038)\n\nHow do I convert this into normal time like:\n10 minutes, 1hour like this\n", "output": "There's no built-in formatter for timedelta objects, but it's pretty easy to do it yourself:\ndays, seconds = duration.days, duration.seconds\nhours = days * 24 + seconds // 3600\nminutes = (seconds % 3600) // 60\nseconds = seconds % 60\n\nOr, equivalently, if you're in Python 2.7+ or 3.2+:\nseconds = duration.total_seconds()\nhours = seconds // 3600\nminutes = (seconds % 3600) // 60\nseconds = seconds % 60\n\nNow you can print it however you want:\n'{} minutes, {} hours'.format(minutes, hours)\n\nFor example:\ndef convert_timedelta(duration):\n    days, seconds = duration.days, duration.seconds\n    hours = days * 24 + seconds // 3600\n    minutes = (seconds % 3600) // 60\n    seconds = (seconds % 60)\n    return hours, minutes, seconds\ntd = datetime.timedelta(2, 7743, 12345)\nhours, minutes, seconds = convert_timedelta(td)\nprint '{} minutes, {} hours'.format(minutes, hours)\n\nThis will print:\n9 minutes, 50 hours\n\nIf you want to get \"10 minutes, 1 hour\" instead of \"10 minutes, 1 hours\", you need to do that manually too:\nprint '{} minute{}, {} hour{}'.format(minutes, 's' if minutes != 1 else '',\n                                      hours, 's' if minutes != 1 else '')\n\nOr you may want to write an english_plural function to do the 's' bits for you, instead of repeating yourself.\nFrom your comments, it sounds like you actually want to keep the days separate. That's even easier:\ndef convert_timedelta(duration):\n    days, seconds = duration.days, duration.seconds\n    hours = seconds // 3600\n    minutes = (seconds % 3600) // 60\n    seconds = (seconds % 60)\n    return days, hours, minutes, seconds\n\nIf you want to convert this to a single value to store in a database, then convert that single value back to format it, do this:\ndef dhms_to_seconds(days, hours, minutes, seconds):\n    return (((days * 24) + hours) * 60 + minutes) * 60 + seconds\n\ndef seconds_to_dhms(seconds):\n    days = seconds // (3600 * 24)\n    hours = (seconds // 3600) % 24\n    minutes = (seconds // 60) % 60\n    seconds = seconds % 60\n    return days, hours, minutes, seconds\n\nSo, putting it together:\ndef store_timedelta_in_database(thingy, duration):\n    seconds = dhms_to_seconds(*convert_timedelta(duration))\n    db.execute('INSERT INTO foo (thingy, duration) VALUES (?, ?)',\n               thingy, seconds)\n    db.commit()\n\ndef print_timedelta_from_database(thingy):\n    cur = db.execute('SELECT duration FROM foo WHERE thingy = ?', thingy)\n    seconds = int(cur.fetchone()[0])\n    days, hours, minutes, seconds = seconds_to_dhms(seconds)\n    print '{} took {} minutes, {} hours, {} days'.format(thingy, minutes, hours, days)\n\n"}
{"input": "What better way to concatenate string in python? Understand \"better\" as a quicker, elegant and readable.\nI have two strings (a and b) that could be null or not. And I want concatenate them separated by a hyphen only if both are not null:\na - b\na (if b is null)\nb (where a is null)\n", "output": "# Concatenates a and b with ' - ' or Coalesces them if one is None\n'-'.join([x for x in (a,b) if x])\n\nEdit\nHere are the results of this algorithm (Note that None will work the same as ''):\n>>> '-'.join([x for x in ('foo','bar') if x])\n'foo-bar'\n>>> '-'.join([x for x in ('foo','') if x])\n'foo'\n>>> '-'.join([x for x in ('','bar') if x])\n'bar'\n>>> '-'.join([x for x in ('','') if x])\n''\n\n*Also note that Rafael's assessment, in his post below, only showed a difference of .0002 secs over a 1000 iterations of the filter method, it can be reasoned that such a small difference can be due to inconsistencies in available system resources at the time of running the script. I ran his timeit implementation over several iteration and found that either algorithm will be faster about 50% of the time, neither by a wide margin. Thus showing they are basically equivalent.\n"}
{"input": "gevent/libevent.h:9:19: fatal error: event.h: No such file or directory I was trying to work on Pyladies website on my local folder. I cloned the repo, (https://github.com/pyladies/pyladies) ! and created the virtual environment. However when I do the pip install -r requirements, I am getting this error\nInstalling collected packages: gevent, greenlet\nRunning setup.py install for gevent\nbuilding 'gevent.core' extension\ngcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -I/opt/local/include -fPIC -I/usr/include/python2.7 -c gevent/core.c -o build/temp.linux-i686-2.7/gevent/core.o\nIn file included from gevent/core.c:253:0:\ngevent/libevent.h:9:19: fatal error: event.h: No such file or directory\ncompilation terminated.\nerror: command 'gcc' failed with exit status 1\nComplete output from command /home/akoppad/virt/pyladies/bin/python -c \"import setuptools;__file__='/home/akoppad/virt/pyladies/build/gevent/setup.py';exec(compile(open(__file__).read().replace('\\r\\n', '\\n'), __file__, 'exec'))\" install --single-version-externally-managed --record /tmp/pip-4MSIGy-record/install-record.txt --install-headers /home/akoppad/virt/pyladies/include/site/python2.7:\nrunning install\n\nrunning build\n\nrunning build_py\n\nrunning build_ext\n\nbuilding 'gevent.core' extension\n\ngcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -I/opt/local/include -fPIC -I/usr/include/python2.7 -c gevent/core.c -o build/temp.linux-i686-2.7/gevent/core.o\n\nIn file included from gevent/core.c:253:0:\n\ngevent/libevent.h:9:19: fatal error: event.h: No such file or directory\n\ncompilation terminated.\n\nerror: command 'gcc' failed with exit status 1\n\n----------------------------------------\nCommand /home/akoppad/virt/pyladies/bin/python -c \"import setuptools;__file__='/home/akoppad/virt/pyladies/build/gevent/setup.py';   exec(compile(open(__file__).read().replace('\\r\\n', '\\n'), __file__, 'exec'))\" install --single-version-externally-managed --record /tmp/pip-4MSIGy-record/install-record.txt --install-headers /home/akoppad/virt/pyladies/include/site/python2.7 failed with error code 1 in /home/akoppad/virt/pyladies/build/gevent\nStoring complete log in /home/akoppad/.pip/pip.log.\n\nI tried doing this, \n    sudo port install libevent\n    CFLAGS=\"-I /opt/local/include -L /opt/local/lib\" pip install gevent\nIt says port command not found. \nI am not sure how to proceed with this. Thanks!\n", "output": "I had the same problem and just as the other answer suggested I had to install \"libevent\". It's apparently not called \"libevent-devel\" anymore (apt-get couldn't find it) but doing: \n    $ apt-cache search libevent\n\nlisted a bunch of available packages. \n    $ apt-get install libevent-dev \n\nworked for me. \n"}
{"input": "python - os.getenv and os.environ don't see environment variables of my bash shell I am on ubuntu 13.04, bash, python2.7.4\nThe interpreter doesn't see variables I set.\nHere is an example:\n$ echo $A\n5\n$ python -c 'import os; print os.getenv( \"A\" )'\nNone\n$ python -c 'import os; print os.environ[ \"A\" ]'\nTraceback (most recent call last):\n  File \"<string>\", line 1, in <module>\n  File \"/usr/lib/python2.7/UserDict.py\", line 23, in __getitem__\n    raise KeyError(key)\nKeyError: 'A'\n\nBut everything works fine with the PATH variable:\n$ echo $PATH \n/usr/lib/lightdm/lightdm:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games\n$ python -c 'import os; print os.getenv(\"PATH\")'\n/usr/lib/lightdm/lightdm:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games\n\nAnd it notices changes in PATH:\n$ PATH=\"/home/alex/tests/:$PATH\"\n$ echo $PATH \n/home/alex/tests/:/usr/lib/lightdm/lightdm:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games\n$ python -c 'import os; print os.getenv(\"PATH\")'\n/home/alex/tests/:/usr/lib/lightdm/lightdm:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games\n\nWhat could be wrong?\nPS  the problem comes when using $PYTHONPATH:\n$ python -c 'import os; print os.getenv(\"PYTHONPATH\")'\nNone\n\n", "output": "Aha! the solution is simple!\nI was setting variables with plain $ A=5 command; when you use $ export B=\"kkk\" everything is fine.\nThat is because export makes the variable available to sub-processes: \n\nit creates a variable in the shell \nand exports it into the environment of the shell\nthe list environment is passed to sub-processes of the shell.\n\nPlain $ A=\"kkk\" just creates variables in the shell and doesn't do anything with the environment.\nThe interpreter called from the shell obtains it's environment from the parent -- the shell. So really the variable should be exported into the environment before.\n"}
{"input": "How to use the same line of code in all functions? I am newbie in Python. \nI wonder if it is possible that all functions inherit the same line of code? \nwith open(filename, 'r') as f: as this line of code is the same in all three functions. Is it possible to inherit the code without using classes?\nI tried to find the answer on stackoverflow and python documentation, but with no luck.\ndef word_count(filename):\n    with open(filename, 'r') as f:\n        return len(f.read().split())\n\n\ndef line_count(filename):\n    with open(filename, 'r') as f:\n        return len(f.read().splitlines())\n\n\ndef character_count(filename):\n    with open(filename, 'r') as f:\n        return len(f.read())\n\n", "output": "The common code in your case is\nwith open(filename, 'r') as f:\n    contents = f.read()\n\nSo just move it to its own function:\ndef get_file_contents(filename):\n    with open(filename, 'r') as f:\n        return f.read()\n\ndef word_count(filename):\n    return len(get_file_contents(filename).split())\n\ndef line_count(filename):        \n    return len(get_file_contents(filename).splitlines())\n\ndef character_count(filename):\n    return len(get_file_contents(filename))\n\n"}
{"input": "What does \"evaluated only once\" mean for chained comparisons in Python? A friend brought this to my attention, and after I pointed out an oddity, we're both confused.\nPython's docs, say, and have said since at least 2.5.1 (haven't checked further back:\n\nComparisons can be chained arbitrarily, e.g., x < y <= z is equivalent to x < y and y <= z, except that y is evaluated only once (but in both cases z is not evaluated at all when x < y is found to be false).\n\nOur confusion lies in the meaning of \"y is evaluated only once\".\nGiven a simple but contrived class:\nclass Magic(object):\n    def __init__(self, name, val):\n    \tself.name = name\n    \tself.val = val\n    def __lt__(self, other):\n    \tprint(\"Magic: Called lt on {0}\".format(self.name))\n    \tif self.val < other.val:\n    \t\treturn True\n    \telse:\n    \t\treturn False\n    def __le__(self, other):\n    \tprint(\"Magic: Called le on {0}\".format(self.name))\n    \tif self.val <= other.val:\n    \t\treturn True\n    \telse:\n    \t\treturn False\n\nWe can produce this result:\n>>> x = Magic(\"x\", 0)\n>>> y = Magic(\"y\", 5)\n>>> z = Magic(\"z\", 10)\n>>> \n>>> if x < y <= z:\n...     print (\"More magic.\")\n... \nMagic: Called lt on x\nMagic: Called le on y\nMore magic.\n>>>\n\nThis certainly looks like 'y' is, in a traditional sense \"evaluated\" twice -- once when x.__lt__(y) is called and performs a comparison on it, and once when y.__le__(z) is called.\nSo with this in mind, what exactly do the Python docs mean when they say \"y is evaluated only once\"?\n", "output": "The 'expression' y is evaluated once. I.e., in the following expression, the function is executed only one time.\n>>> def five():\n...    print 'returning 5'\n...    return 5\n... \n>>> 1 < five() <= 5\nreturning 5\nTrue\n\nAs opposed to:\n>>> 1 < five() and five() <= 5\nreturning 5\nreturning 5\nTrue\n\n"}
{"input": "How can I list the methods in a Python 2.5 module? I'm trying to use a Python library written in C that has no documentation of any kind. I want to use introspection to at least see what methods and classes are in the modules. Does somebody have a function or library I can use to list the functions (with argument lists) and classes (with methods and member variables) within a module? \nI found this article about Python introspection, but I'm pretty sure it doesn't apply to Python 2.5. Thanks for the help.\n", "output": "Here are some things you can do at least:\nimport module\n\nprint dir(module) # Find functions of interest.\n\n# For each function of interest:\nhelp(module.interesting_function)\nprint module.interesting_function.func_defaults\n\n"}
{"input": "Unittest tests order How do I be sure of the unittest methods order? Is the alphabetical or numeric prefixes the proper way?\nclass TestFoo(TestCase):\n    def test_1(self):\n        ...\n    def test_2(self):\n        ...\n\nor\nclass TestFoo(TestCase):\n    def test_a(self):\n        ...\n    def test_b(self):\n        ...\n\n", "output": "There is no reason given that you can't build on what was done in a previous test or should rebuild it all from scratch for the next test. At least no reason is usually offered but instead people just confidently say \"you shouldn't\".  That isn't helpful.\nIn general I am tired of reading too many answers here that say basically \"you shouldn't do that\" instead of giving any information on how to best do it if in the questioners judgment there is good reason to do so.  If I wanted someone's opinion on whether I should do something  then I would have asked for opinions on whether doing it is a good idea. \nThat out of the way, if you read say loadTestsFromTestCase and what it calls it ultimately scans for methods with some name pattern in whatever order they are encountered in the classes  method dictionary, so basically in key order.  It take this information and makes a testsuite of mapping it to the TestCase class.  Giving it instead a list ordered as you would like is one way to do this.  I am not so sure of the most efficient/cleanest way to do it but this does work. \n"}
{"input": "how to stop a for loop I am writing a code to determine if every element in my nxn list is the same. i.e. [[0,0],[0,0]] returns true but [[0,1],[0,0]] will return false. I was thinking of writing a code that stops immediately when it finds an element that is not the same as the first element. i.e:\nn=L[0][0]\nm=len(A)\nfor i in range(m):\n for j in range(m):\n   if\n    L[i][j]==n: -continue the loop-\n   else: -stop the loop-\n\nI would like to stop this loop if L[i][j]!==n  and return false. otherwise return true. How would I go about implementing this?\n", "output": "Use break and continue to do this. Breaking nested loops can be done in Python using the following:\nfor a in range(...):\n   for b in range(..):\n      if some condition:\n         # break the inner loop\n         break\n   else:\n      # will be called if the previous loop did not end with a `break` \n      continue\n   # but here we end up right after breaking the inner loop, so we can\n   # simply break the outer loop as well\n   break\n\nAnother way is to wrap everything in a function and use return to escape from the loop.\n"}
{"input": "Python readline() from a string? In python, is there a built-in way to do a readline() on string?  I have a large chunk of data and want to strip off just the first couple lines w/o doing split() on the whole string.\nHypothetical example:\ndef handleMessage(msg):\n   headerTo  = msg.readline()\n   headerFrom= msg.readline()\n   sendMessage(headerTo,headerFrom,msg)\n\nmsg = \"Bob Smith\\nJane Doe\\nJane,\\nPlease order more widgets\\nThanks,\\nBob\\n\"\nhandleMessage(msg)\n\nI want this to result in:\n sendMessage(\"Bob Smith\", \"Jane Doe\", \"Jane,\\nPlease order...\")\nI know it would be fairly easy to write a class that does this, but I'm looking for something built-in if possible.\nEDIT: Python v2.7\n", "output": "Python 2\nYou can use StringIO:\n>>> msg = \"Bob Smith\\nJane Doe\\nJane,\\nPlease order more widgets\\nThanks,\\nBob\\n\"\n>>> msg\n'Bob Smith\\nJane Doe\\nJane,\\nPlease order more widgets\\nThanks,\\nBob\\n'\n>>> import StringIO\n>>> buf = StringIO.StringIO(msg)\n>>> buf.readline()\n'Bob Smith\\n'\n>>> buf.readline()\n'Jane Doe\\n'\n\nBe sure to use cStringIO if performance is important.\nPython 3\nYou can use io.StringIO:\n>>> import io\n>>> buf = io.StringIO(msg)\n>>> buf.readline()\n'Bob Smith\\n'\n>>> buf.readline()\n'Jane Doe\\n'\n>>> len(buf.read())\n44\n\n"}
{"input": "How to create a user in Django? I'm trying to create a new User in a Django project by the following code, but the highlighted line fires an exception.\ndef createUser(request):\n    userName = request.REQUEST.get('username', None)\n    userPass = request.REQUEST.get('password', None)\n    userMail = request.REQUEST.get('email', None)\n\n    # TODO: check if already existed\n\n    **user = User.objects.create_user(userName, userMail, userPass)**\n    user.save()\n\n    return render_to_response('home.html', context_instance=RequestContext(request))\n\nAny help?\n", "output": "The correct way to create a user in Django is to use the create_user function. This will handle the hashing of the password, etc.. \nfrom django.contrib.auth.models import User\nuser = User.objects.create_user(username='john',\n                                 email='jlennon@beatles.com',\n                                 password='glass onion')\n\n"}
{"input": "Can we have assignment in a condition? Is it possible to have assignment in a condition?\nFor ex.\nif (a=some_func()):\n    # Use a\n\n", "output": "Why not try it out?\n>>> def some_func():\n...   return 2\n... \n>>> a = 2\n>>> if (a = some_func()):\n  File \"<stdin>\", line 1\n    if (a = some_func()):\n          ^\nSyntaxError: invalid syntax\n>>> \n\nSo, no.\n"}
{"input": "How can I get href links from HTML using Python? import urllib2\n\nwebsite = \"WEBSITE\"\nopenwebsite = urllib2.urlopen(website)\nhtml = getwebsite.read()\n\nprint html\n\nSo far so good. \nBut I want only href links from the plain text HTML. How can I solve this problem? \n", "output": "Try with Beautifulsoup:\nfrom BeautifulSoup import BeautifulSoup\nimport urllib2\nimport re\n\nhtml_page = urllib2.urlopen(\"http://www.yourwebsite.com\")\nsoup = BeautifulSoup(html_page)\nfor link in soup.findAll('a'):\n    print link.get('href')\n\nIn case you just want links starting with http://, you should use:\nsoup.findAll('a', attrs={'href': re.compile(\"^http://\")})\n\n"}
{"input": "installing django 1.5(development version) in virtualenv I am trying to start developing a web apps using django 1.5 but I can't figure out how to install the django 1.5 in the virtualenv because django 1.5 is still a development version.\nI've tried installing it using easy_install and pip but all I get is django 1.4 because it is the official version.\nCan anybody help me with this?\n", "output": "Django 1.5 was released!.  To install it use:\npip install django\n\nTo install the latest development version without git, on your own risk use:\npip install https://github.com/django/django/zipball/master\n\nDjango 1.5 Release Candidate 2 (RC2) was released, and it can be installed using pip without installing git:\npip install https://www.djangoproject.com/download/1.5c2/tarball/\n\n\n"}
{"input": "Checking if all the elements in one list are also in another I'm trying to compare two lists and simply print a message if the same value was found in both lists.\ndef listCompare():\n  list1 = [1, 2, 3, 4, 5]\n  list2 = [5, 6, 7, 8, 9]\n  if list1 in list2:\n    print(\"Number was found\")\n  else:\n    print(\"Number not in list\")\n\nThis doesnt work, and I'm not sure of the simplest way to compare the two lists. \n", "output": "You could solve this many ways. One that is pretty simple to understand is to just use a loop.\ndef comp(list1, list2):\n    for val in list1:\n        if val in list2:\n            return True\n    return False\n\nA more compact way you can do it is to use map and reduce:\nreduce(lambda v1,v2: v1 or v2, map(lambda v: v in list2, list1))\n\nEven better, the reduce can be replaced with any:\nany(map(lambda v: v in list2, list1))\n\nYou could also use sets:\nlen(set(list1).intersection(list2)) > 0\n\n"}
{"input": "Output pyodbc cursor results as python dictionary How do I serialize pyodbc cursor output (from .fetchone, .fetchmany or .fetchall) as a Python dictionary?\nI'm using bottlepy and need to return dict so it can return it as JSON.\n", "output": "If you don't know columns ahead of time, use cursor.description to build a list of column names and zip with each row to produce a list of dictionaries. Example assumes connection and query are built:\n>>> cursor = connection.cursor().execute(sql)\n>>> columns = [column[0] for column in cursor.description]\n>>> print columns\n['name', 'create_date']\n>>> results = []\n>>> for row in cursor.fetchall():\n...     results.append(dict(zip(columns, row)))\n...\n>>> print results\n[{'create_date': datetime.datetime(2003, 4, 8, 9, 13, 36, 390000), 'name': u'master'},   \n {'create_date': datetime.datetime(2013, 1, 30, 12, 31, 40, 340000), 'name': u'tempdb'},\n {'create_date': datetime.datetime(2003, 4, 8, 9, 13, 36, 390000), 'name': u'model'},     \n {'create_date': datetime.datetime(2010, 4, 2, 17, 35, 8, 970000), 'name': u'msdb'}]\n\n"}
{"input": "How can I make a scatter plot colored by density in matplotlib? I'd like to make a scatter plot where each point is colored by the spatial density of nearby points.  \nI've come across a very similar question, which shows an example of this using R:\nR Scatter Plot: symbol color represents number of overlapping points\nWhat's the best way to accomplish something similar in python using matplotlib?\n", "output": "In addition to hist2d or hexbin as @askewchan suggested, you can use the same method that the accepted answer in the question you linked to uses.\nIf you want to do that:\nimport numpy as np\nimport matplotlib.pyplot as plt\nfrom scipy.stats import gaussian_kde\n\n# Generate fake data\nx = np.random.normal(size=1000)\ny = x * 3 + np.random.normal(size=1000)\n\n# Calculate the point density\nxy = np.vstack([x,y])\nz = gaussian_kde(xy)(xy)\n\nfig, ax = plt.subplots()\nax.scatter(x, y, c=z, s=100, edgecolor='')\nplt.show()\n\n\nIf you'd like the points to be plotted in order of density so that the densest points are always on top (similar to the linked example), just sort them by the z-values.  I'm also going to use a smaller marker size here as it looks a bit better:\nimport numpy as np\nimport matplotlib.pyplot as plt\nfrom scipy.stats import gaussian_kde\n\n# Generate fake data\nx = np.random.normal(size=1000)\ny = x * 3 + np.random.normal(size=1000)\n\n# Calculate the point density\nxy = np.vstack([x,y])\nz = gaussian_kde(xy)(xy)\n\n# Sort the points by density, so that the densest points are plotted last\nidx = z.argsort()\nx, y, z = x[idx], y[idx], z[idx]\n\nfig, ax = plt.subplots()\nax.scatter(x, y, c=z, s=50, edgecolor='')\nplt.show()\n\n\n"}
{"input": "Python MYSQL update statement I'm trying to get this Python MYSQL update statement correct(With Variables):\ncursor.execute (\"UPDATE tblTableName SET Year=%s\" % Year \", Month=%s\" % Month \", Day=%s\" % Day \", Hour=%s\" % Hour \", Minute=%s\" Minute \"WHERE Server=%s \" % ServerID)\n\nAny ideas where I'm going wrong?\n", "output": "It should be:\ncursor.execute (\"\"\"\n   UPDATE tblTableName\n   SET Year=%s, Month=%s, Day=%s, Hour=%s, Minute=%s\n   WHERE Server=%s\n\"\"\", (Year, Month, Day, Hour, Minute, ServerID))\n\nYou can also do it with basic string manipulation,\ncursor.execute (\"UPDATE tblTableName SET Year=%s, Month=%s, Day=%s, Hour=%s, Minute=%s WHERE Server='%s' \" % (Year, Month, Day, Hour, Minute, ServerID))\n\nbut this way is discouraged because it leaves you open for SQL Injection. As it's so easy (and similar) to do it the right waytm. Do it correctly. \nThe only thing you should be careful, is that some database backends don't follow the same convention for string replacement (SQLite comes to mind).\n"}
{"input": "How to add key,value pair to dictionary? How to add key,value pair to dictionary?.Below i have mentioned following format?\n{'1_somemessage': [[3L,\n                    1L,\n                    u'AAA',\n                    1689544L,\n                    datetime.datetime(2010, 9, 21, 22, 30),\n                    u'gffggf'],\n                   [3L,\n                    1L,\n                    u'BBB',\n                    1689544L,\n                    datetime.datetime(2010, 9, 21, 20, 30),\n                    u'ffgffgfg'],\n                   [3L,\n                    1L,\n                    u'CCC',\n                    1689544L,\n                    datetime.datetime(2010, 9, 21, 22, 30),\n                    u'hjhjhjhj'],\n                   [3L,\n                    1L,\n                    u'DDD',\n                    1689544L,\n                    datetime.datetime(2010, 9, 21, 21, 45),\n                    u'jhhjjh']],\n '2_somemessage': [[4L,\n                    1L,\n                    u'AAA',\n                    1689544L,\n                    datetime.datetime(2010, 9, 21, 22, 30),\n                    u'gffggf'],\n                   [4L,\n                    1L,\n                    u'BBB',\n                    1689544L,\n                    datetime.datetime(2010, 9, 21, 20, 30),\n                    u'ffgffgfg'],\n                   [4L,\n                    1L,\n                    u'CCC',\n                    1689544L,\n                    datetime.datetime(2010, 9, 21, 22, 30),\n                    u'hjhjhjhj'],\n                   [4L,\n                    1L,\n                    u'DDD',\n                    1689544L,\n                    datetime.datetime(2010, 9, 21, 21, 45),\n                    u'jhhjjh']]}\n\n", "output": "Add a key, value pair to dictionary \naDict = {}\naDict[key] = value\n\nWhat do you mean by dynamic addition.\n"}
{"input": "Reformat a Python file to have 4 space indentations I'm editing a Python file that uses two spaces for programmatic indents -  I prefer 4 spaces. In my .vimrc I have the following settings related to indentation:\nset tabstop=4                     \"Indentation levels every four columns\nset expandtab                     \"Convert all tabs typed to spaces\nset shiftwidth=4                  \"Indent/outdent by four columns\nset softtabstop=4\n\nHow do I get Vim to convert all the existing 2 space indents to be 4 space indents?\nIn other words:\nif something:\n  dothis()\n\nbecomes\nif something:\n    dothis()\n\nWhen I tried gg=G\ndef check():\n  for a in list:\n    for  b in list2:\n      check(a, b)\n      while (len > MAX) :\n        poll()\n\n  while(len(thelist) > 0) :\n    poll()  \n\n  return results\n\nbecame\ndef check():\n    for a in list:\n    for  b in list2:\n    check(a, b)\n    while (len > MAX) : \n        poll()\n\n        while(len(thelist) > 0) :\n            poll()\n\n            return results\n\n", "output": "In order to double the number of spaces at the beginning of every line (and only at the beginning):\n:%s/^\\s*/&&/g\n\n& in replacement pattern is the matched pattern.\nProbably it will not have any side-effect for you.\n"}
{"input": "Add one year in current date PYTHON I have fetched a date from database with the following variable \n{{ i.operation_date }}\n\nwith which I got a value like \nApril 1, 2013\n\nI need to add one year to the above, so that I can get \nApril 1, 2014\n\nPlease suggest, how can I do this?\n", "output": "AGSM's answer shows a convenient way of solving this problem using the python-dateutil package. But what if you don't want to install that package? You could solve the problem in vanilla Python like this:\nfrom datetime import date\n\ndef add_years(d, years):\n    \"\"\"Return a date that's `years` years after the date (or datetime)\n    object `d`. Return the same calendar date (month and day) in the\n    destination year, if it exists, otherwise use the following day\n    (thus changing February 29 to March 1).\n\n    \"\"\"\n    try:\n        return d.replace(year = d.year + years)\n    except ValueError:\n        return d + (date(d.year + years, 1, 1) - date(d.year, 1, 1))\n\n"}
{"input": "Django app works fine, but getting a TEMPLATE_* warning message When I use runserver, it gives this warning message:\n\n(1_8.W001) The standalone TEMPLATE_* settings were deprecated in\n  Django 1.8 and the TEMPLATES dictionary takes precedence. You must put\n  the values of the following settings into your default TEMPLATES dict:\n  TEMPLATE_DEBUG.\n\nQuoth the Django Documentation: \n\n\"TEMPLATE_DEBUG Deprecated since version 1.8: Set the 'debug' option\n  in the OPTIONS of a DjangoTemplates backend instead.\"\n\nHere is my settings.py with my futile attempts to fix it:\nDEBUG = True\n\nTEMPLATE_DEBUG = DEBUG\n\nTEMPLATES = [\n    {\n        'BACKEND': 'django.template.backends.django.DjangoTemplates',\n        'DIRS': [os.path.join(BASE_DIR, 'myapp/templates')],\n        'APP_DIRS': True,\n        'OPTIONS': {\n            'context_processors': [\n                'django.template.context_processors.debug',\n                'django.template.context_processors.request',\n                'django.contrib.auth.context_processors.auth',\n                'django.contrib.messages.context_processors.messages',\n            ],\n            'debug': DEBUG,\n            'DEBUG': DEBUG,\n            'TEMPLATE_DEBUG': DEBUG\n        },\n    }, ]\n\nWhat am I missing here?\n", "output": "Set debug in OPTIONS dictionary of your templates settings.\nDEBUG = True\n\nTEMPLATES = [\n    {\n        ...\n        'OPTIONS': {\n            'debug': DEBUG,\n        },\n    },\n]\n\nThen remove this line from your settings to stop the warnings\nTEMPLATE_DEBUG = DEBUG\n\n"}
{"input": "Spoofing the origination IP address of an HTTP request This only needs to work on a single subnet and is not for malicious use.  \nI have a load testing tool written in Python that basically blasts HTTP requests at a URL.  I need to run performance tests against an IP-based load balancer, so the requests must come from a range of IP's.  Most commercial performance tools provide this functionality, but I want to build it into my own.\nThe tool uses Python's urllib2 for transport.  Is it possible to send HTTP requests with spoofed IP addresses for the packets making up the request?\n", "output": "This is a misunderstanding of HTTP. The HTTP protocol is based on top of TCP. The TCP protocol relies on a 3 way handshake to initialize requests.\n\nNeedless to say, if you spoof your originating IP address, you will never get past the synchronization stage and no HTTP information will be sent (the server can't send it to a legal host).\nIf you need to test an IP load balancer, this is not the way to do it.\n"}
{"input": "Python: remove dictionary from list If I have a list of dictionaries, say:\n[{'id': 1, 'name': 'paul'},\n{'id': 2, 'name': 'john'}]\n\nand I would like to remove the dictionary with id of 2 (or name john), what is the most efficient way to go about this programmatically (that is to say, I don't know the index of the entry in the list so it can't simply be popped).\n", "output": "thelist[:] = [d for d in thelist if d.get('id') != 2]\n\nEdit: as some doubts have been expressed in a comment about the performance of this code (some based on misunderstanding Python's performance characteristics, some on assuming beyond the given specs that there is exactly one dict in the list with a value of 2 for key 'id'), I wish to offer reassurance on this point.\nOn an old Linux box, measuring this code:\n$ python -mtimeit -s\"lod=[{'id':i, 'name':'nam%s'%i} for i in range(99)]; import random\" \"thelist=list(lod); random.shuffle(thelist); thelist[:] = [d for d in thelist if d.get('id') != 2]\"\n10000 loops, best of 3: 82.3 usec per loop\n\nof which about 57 microseconds for the random.shuffle (needed to ensure that the element to remove is not ALWAYS at the same spot;-) and 0.65 microseconds for the initial copy (whoever worries about performance impact of shallow copies of Python lists is most obviously out to lunch;-), needed to avoid altering the original list in the loop (so each leg of the loop does have something to delete;-).\nWhen it is known that there is exactly one item to remove, it's possible to locate and remove it even more expeditiously:\n$ python -mtimeit -s\"lod=[{'id':i, 'name':'nam%s'%i} for i in range(99)]; import random\" \"thelist=list(lod); random.shuffle(thelist); where=(i for i,d in enumerate(thelist) if d.get('id')==2).next(); del thelist[where]\"\n10000 loops, best of 3: 72.8 usec per loop\n\n(use the next builtin rather than the .next method if you're on Python 2.6 or better, of course) -- but this code breaks down if the number of dicts that satisfy the removal condition is not exactly one. Generalizing this, we have:\n$ python -mtimeit -s\"lod=[{'id':i, 'name':'nam%s'%i} for i in range(33)]*3; import random\" \"thelist=list(lod); where=[i for i,d in enumerate(thelist) if d.get('id')==2]; where.reverse()\" \"for i in where: del thelist[i]\"\n10000 loops, best of 3: 23.7 usec per loop\n\nwhere the shuffling can be removed because there are already three equispaced dicts to remove, as we know. And the listcomp, unchanged, fares well:\n$ python -mtimeit -s\"lod=[{'id':i, 'name':'nam%s'%i} for i in range(33)]*3; import random\" \"thelist=list(lod); thelist[:] = [d for d in thelist if d.get('id') != 2]\"\n10000 loops, best of 3: 23.8 usec per loop\n\ntotally neck and neck, with even just 3 elements of 99 to be removed. With longer lists and more repetitions, this holds even more of course:\n$ python -mtimeit -s\"lod=[{'id':i, 'name':'nam%s'%i} for i in range(33)]*133; import random\" \"thelist=list(lod); where=[i for i,d in enumerate(thelist) if d.get('id')==2]; where.reverse()\" \"for i in where: del thelist[i]\"\n1000 loops, best of 3: 1.11 msec per loop\n$ python -mtimeit -s\"lod=[{'id':i, 'name':'nam%s'%i} for i in range(33)]*133; import random\" \"thelist=list(lod); thelist[:] = [d for d in thelist if d.get('id') != 2]\"\n1000 loops, best of 3: 998 usec per loop\n\nAll in all, it's obviously not worth deploying the subtlety of making and reversing the list of indices to remove, vs the perfectly simple and obvious list comprehension, to possibly gain 100 nanoseconds in one small case -- and lose 113 microseconds in a larger one;-). Avoiding or criticizing simple, straightforward, and perfectly performance-adequate solutions (like list comprehensions for this general class of \"remove some items from a list\" problems) is a particularly nasty example of Knuth's and Hoare's well-known thesis that \"premature optimization is the root of all evil in programming\"!-)\n"}
{"input": "Why aren't \"and\" and \"or\" operators in Python? I wasn't aware of this, but apparently the and and or keywords aren't operators.  They don't appear in the list of python operators.  Just out of sheer curiosity, why is this?  And if they aren't operators, what exactly are they?\n", "output": "Because they're control flow constructs. Specifically:\n\nif the left argument to and evaluates to False, the right argument doesn't get evaluated at all\nif the left argument to or evaluates to True, the right argument doesn't get evaluated at all\n\nThus, it is not simply a matter of being reserved words. They don't behave like operators, since operators always evaluate all of their arguments.\nYou can contrast this with bitwise binary operators which, as the name implies, are operators:\n>>> 1 | (1/0)\nTraceback (most recent call last):\n  File \"<stdin>\", line 1, in <module>\nZeroDivisionError: integer division or modulo by zero\n>>> 1 or (1/0)\n1\n\nAs you see, the bitwise OR (|) evaluates both its arguments. The or keyword, however, doesn't evaluate its right argument at all when the left argument evaluates to True; that's why no ZeroDivisionError is raised in the second statement.\n"}
{"input": "Start index for iterating Python list What is the best way to set a start index when iterating a list in Python. For example, I have a list of the days of the week - Sunday, Monday, Tuesday, ... Saturday - but I want to iterate through the list starting at Monday. What is the best practice for doing this?\n", "output": "You can use slicing:\nfor item in list[2:]:\n    # do stuff\n\nThis will start at the third element and iterate to the end.\n"}
{"input": "Python title() with apostrophes Is there a way to use .title() to get the correct output from a title with apostrophes? For example:\n\"john's school\".title() --> \"John'S School\"\n\nHow would I get the correct title here, \"John's School\" ?\n", "output": "If your titles do not contain several whitespace characters in a row (which would be collapsed), you can use string.capwords() instead:\n>>> import string\n>>> string.capwords(\"john's school\")\n\"John's School\"\n\nEDIT: As Chris Morgan rightfully says below, you can alleviate the whitespace collapsing issue by specifying \" \" in the sep argument:\n>>> string.capwords(\"john's    school\", \" \")\n\"John's    School\"\n\n"}
{"input": "python numpy array of arbitrary length strings I'm a complete rookie to Python, but it seems like a given string is able to be (effectively) arbitrary length.  I.e. you can take a string str and keeping adding to it: str += \"some stuff...\".  Is there a way to make an array of such strings?\nWhen I try this, each element only stores a single character\nstrArr = numpy.empty(10, dtype='string')\nfor i in range(0,10)\n    strArr[i] = \"test\"\n\nOn the other hand, I know I can initialize an array of certain length strings, i.e.\nstrArr = numpy.empty(10, dtype='s256')\n\nwhich can store 10 strings of up to 256 characters\n", "output": "You can do so by creating an array of dtype=object. If you try to assign a long string to a normal numpy array, it truncates the string:\n>>> a = numpy.array(['apples', 'foobar', 'cowboy'])\n>>> a[2] = 'bananas'\n>>> a\narray(['apples', 'foobar', 'banana'], \n      dtype='|S6')\n\nBut when you use dtype=object, you get an array of python object references. So you can have all the behaviors of python strings:\n>>> a = numpy.array(['apples', 'foobar', 'cowboy'], dtype=object)\n>>> a\narray([apples, foobar, cowboy], dtype=object)\n>>> a[2] = 'bananas'\n>>> a\narray([apples, foobar, bananas], dtype=object)\n\nIndeed, because it's an array of objects, you can assign any kind of python object to the array:\n>>> a[2] = {1:2, 3:4}\n>>> a\narray([apples, foobar, {1: 2, 3: 4}], dtype=object)\n\nHowever, this undoes a lot of the benefits of using numpy, which is so fast because it works on large contiguous blocks of raw memory. Working with python objects adds a lot of overhead. A simple example:\n>>> a = numpy.array(['abba' for _ in range(10000)])\n>>> b = numpy.array(['abba' for _ in range(10000)], dtype=object)\n>>> %timeit a.copy()\n100000 loops, best of 3: 2.51 us per loop\n>>> %timeit b.copy()\n10000 loops, best of 3: 48.4 us per loop\n\n"}
{"input": "How can i parse a comma delimited string into a list (caveat)? I need to be able to take a string like:\n'''foo, bar, \"one, two\", three four'''\n\ninto:\n['foo', 'bar', 'one, two', 'three four']\n\nI have an feeling (with hints from #python) that the solution is going to involve the shlex module.\n", "output": "It depends how complicated you want to get... do you want to allow more than one type of quoting.  How about escaped quotes?\nYour syntax looks very much like the common CSV file format, which is supported by the Python standard library:\nimport csv\nreader = csv.reader(['''foo, bar, \"one, two\", three four'''], skipinitialspace=True)\nfor r in reader:\n  print r\n\nOutputs:\n['foo', 'bar', 'one, two', 'three four']\n\nHTH!\n"}
{"input": "How do I re.search or re.match on a whole file without reading it all into memory? I want to be able to run a regular expression on an entire file, but I'd like to be able to not have to read the whole file into memory at once as I may be working with rather large files in the future. Is there a way to do this? Thanks!\nClarification: I cannot read line-by-line because it can span multiple lines.\n", "output": "You can use mmap to map the file to memory. The file contents can then be accessed like a normal string:\nimport re, mmap\n\nwith open('/var/log/error.log', 'r+') as f:\n  data = mmap.mmap(f.fileno(), 0)\n  mo = re.search('error: (.*)', data)\n  if mo:\n    print \"found error\", mo.group(1)\n\nThis also works for big files, the file content is internally loaded from disk as needed.\n"}
{"input": "How to install pywin32 module in windows 7 I am trying to install pywin32 module. I downloaded it from sourceforge.net. When i run\nsetup.py install\n\nIt showed \"Unable to find vcvarsall.bat\". I googled about it and found that I have to install mingW and set path then run \npython setup.py build --compiler=mingw32\n\nbut it's showing \nCan't find a version in Windows.h\n\nI tried to google it but can't solve the problem. Can anybody help me in this?\n(I have visual studio 2010 ultimate trial version installed and mingw c and c++ compilor installed.)  operating system : windows 7.\nCan anybody tell me a source from where I can download python binaries already compiled in mingW , so I don't have to do all this. Or if I can get a python which already have windows modules. Thanks in advance.\n", "output": "You can install pywin32 wheel packages from PYPI with PIP by pointing to this package: https://pypi.python.org/pypi/pypiwin32 No need to worry about first downloading the package, just use pip:\npip install pypiwin32\n\nCurrently I think this is \"the easiest\" way to get in working :) Hope this helps.\n"}
{"input": "Inheritance and init method in Python I'm begginer of python. I can't understand inheritance and init().\nclass Num:\n    def __init__(self,num):\n        self.n1 = num\n\nclass Num2(Num):\n    def show(self):\n        print self.n1\n\nmynumber = Num2(8)\nmynumber.show()\n\nRESULT: 8\nThis is OK.But I replace Num2 with\nclass Num2(Num):\n    def __init__(self,num):\n        self.n2 = num*2\n    def show(self):\n        print self.n1,self.n2\n\nRESULT: Error. Num2 has no attribute \"n1\".\nIn this case, how can Num2 access \"n1\"?\n", "output": "In the first situation, Num2 is extending the class Num and since you are not redefining the special method named __init__() in Num2, it gets inherited from Num.\n\nWhen a class defines an __init__()\n  method, class instantiation\n  automatically invokes __init__() for\n  the newly-created class instance.\n\nIn the second situation, since you are redefining __init__() in Num2 you need to explicitly call the one in the super class (Num) if you want to extend its behavior.\nclass Num2(Num):\n    def __init__(self,num):\n        Num.__init__(self,num)\n        self.n2 = num*2\n\n"}
{"input": "Load module from string in python I have some code in the form of a string and would like to make a module out of it without writing to disk.\nWhen I try using imp and a StringIO object to do this, I get:\n>>> imp.load_source('my_module', '', StringIO('print \"hello world\"'))\nTraceback (most recent call last):\n  File \"<stdin>\", line 1, in <module>\nTypeError: load_source() argument 3 must be file, not instance\n>>> imp.load_module('my_module', StringIO('print \"hello world\"'), '', ('', '', 0))\nTraceback (most recent call last):\n  File \"<stdin>\", line 1, in <module>\nValueError: load_module arg#2 should be a file or None\n\nHow can I create the module without having an actual file?  Alternatively, how can I wrap a StringIO in a file without writing to disk?\nUPDATE:\nNOTE: This issue is also a problem in python3.\nThe code I'm trying to load is only partially trusted.  I've gone through it with ast and determined that it doesn't import anything or do anything I don't like, but I don't trust it enough to run it when I have local variables running around that could get modified, and I don't trust my own code to stay out of the way of the code I'm trying to import.\nI created an empty module that only contains the following:\ndef load(code):\n    # Delete all local variables\n    globals()['code'] = code\n    del locals()['code']\n\n    # Run the code\n    exec(globals()['code'])\n\n    # Delete any global variables we've added\n    del globals()['load']\n    del globals()['code']\n\n    # Copy k so we can use it\n    if 'k' in locals():\n        globals()['k'] = locals()['k']\n        del locals()['k']\n\n    # Copy the rest of the variables\n    for k in locals().keys():\n        globals()[k] = locals()[k]\n\nThen you can import mymodule and call mymodule.load(code).  This works for me because I've ensured that the code I'm loading does not use globals.  Also, the global keyword is only a parser directive and can't refer to anything outside of the exec.\nThis really is way too much work to import the module without writing to disk, but if you ever want to do this, I believe it's the best way.\n", "output": "Here is how to import a string as a module (Python 2.x): \nimport sys,imp\n\nmy_code = 'a = 5'\nmymodule = imp.new_module('mymodule')\nexec my_code in mymodule.__dict__\n\nIn Python 3, exec is a function, so this should work: \nimport sys,imp\n\nmy_code = 'a = 5'\nmymodule = imp.new_module('mymodule')\nexec(my_code, mymodule.__dict__)\n\nNow access the module attributes (and functions, classes etc) as:\nprint(mymodule.a)\n>>> 5\n\nTo ignore any next attempt to import, add the module to sys:\nsys.modules['mymodule'] = mymodule\n\n"}
{"input": "Python: \"TypeError: __str__ returned non-string\" but still prints to output? I have this piece of code which creates a new note..WHen I try to print I get the following error even though it prints the output\nError:\nC:\\Python27\\Basics\\OOP\\formytesting>python notebook.py\nMemo=This is my first memo, Tag=example\nTraceback (most recent call last):\n  File \"notebook.py\", line 14, in <module>\n    print(firstnote)\nTypeError: __str__ returned non-string (type NoneType)\n\nnote.py\nimport datetime\nclass Note:\n    def __init__(self, memo, tags):\n        self.memo = memo\n        self.tags = tags\n        self.creation_date = datetime.date.today()\n\n    def __str__(self):\n        print('Memo={0}, Tag={1}').format(self.memo, self.tags)\n\n\nif __name__ == \"__main__\":\n    firstnote = Note('This is my first memo','example')\n    print(firstnote)\n\n", "output": "Method __str__ should return string, not print.\ndef __str__(self):\n    return 'Memo={0}, Tag={1}'.format(self.memo, self.tags)\n\n"}
{"input": "How to find the cumulative sum of numbers in a list? time_interval=[4,6,12]\n\nI want to sum up the numbers like [4+0, 4+6, 4+6+12] in order to get the list t=[4,10,22]. I tried:\nx=0\nfor i in (time_interval):\n    t1=time_interval[0]\n    t2=time_interval[1]+t1\n    t3=time_interval[2]+t2\n    print(t1,t2,t3)\n\n4 10 22\n4 10 22\n4 10 22\n\n", "output": "If you're doing much numerical work with arrays like this, I'd suggest numpy, which comes with a cumulative sum function cumsum:\nimport numpy as np\n\na = [4,6,12]\n\nnp.cumsum(a)\n#array([4, 10, 22])\n\nNumpy is often faster than pure python for this kind of thing, see in comparison to @Ashwini's accumu:\nIn [136]: timeit list(accumu(range(1000)))\n10000 loops, best of 3: 161 us per loop\n\nIn [137]: timeit list(accumu(xrange(1000)))\n10000 loops, best of 3: 147 us per loop\n\nIn [138]: timeit np.cumsum(np.arange(1000))\n100000 loops, best of 3: 10.1 us per loop\n\nBut of course if it's the only place you'll use numpy, it might not be worth having a dependence on it.\n"}
{"input": "Python: Pass or Sleep for long running processes? I am writing an queue processing application which uses threads for waiting on and responding to queue messages to be delivered to the app.  For the main part of the application, it just needs to stay active.  For a code example like:\n\nwhile True:\n  pass\n\nor\n\nwhile True:\n  time.sleep(1)\n\nWhich one will have the least impact on a system?  What is the preferred way to do nothing, but keep a python app running?\n", "output": "I would imagine time.sleep() will have less overhead on the system. Using pass will cause the loop to immediately re-evaluate and peg the CPU, whereas using time.sleep will allow the execution to be temporarily suspended.\nEDIT: just to prove the point, if you launch the python interpreter and run this: \n>>> while True:\n...     pass\n...\n\nYou can watch Python start eating up 90-100% CPU instantly, versus: \n>>> import time \n>>> while True:\n...     time.sleep(1)\n...\n\nWhich barely even registers on the Activity Monitor (using OS X here but it should be the same for every platform).\n"}
{"input": "Determining what version of Flask is installed What's the easiest way to determine which version of Flask is installed?\n", "output": "As of flask 0.7 (June 28th, 2011), a __version__ attribute can be found on the flask module.\n>> import flask\n>> flask.__version__\n\nKeep in mind that because prior to flask 0.7 there was no __version__ attribute, the preceding code will result in an attribute error on those older versions.\nFor versions older than flask 0.7, you might be able to determine it using pkg_resources as shown below:\n\n>>> import pkg_resources\n>>> pkg_resources.get_distribution('flask').version\n'0.6.1'\n\nThis won't work 100% though. It depends on the user having the pkg_resources library installed (it might come by default with a Linux distribution's python installation, but since it's not part of the standard library you can't be positive), and also that the user installed flask in a way that pkg_resources can find it (for example, just copying the full flask source code into your directory puts it out of the range of pkg_resources).\n"}
{"input": "Django returns 403 error when sending a POST request when I'm using following Python code to send a POST request to my Django website I'm getting 403: Forbidden error.\nurl = 'http://www.sub.domain.com/'\nvalues = { 'var': 'test' }\n\ntry:\n    data = urllib.urlencode(values, doseq=True)\n    req = urllib2.Request(url, data)\n    response = urllib2.urlopen(req)\n    the_page = response.read()\nexcept:\n    the_page = sys.exc_info()\n    raise\n\nWhen I'm opening any other website it works properly.\ndomain.com is Django website too, and it works properly too.\nI think, that's Django config problem, can anyone tell me what should I do to provide access to my script?\n", "output": "Look here https://docs.djangoproject.com/en/dev/ref/csrf/#how-to-use-it.\nTry marking your view with @csrf_exempt. That way, Django's CSRF middleware will ignore CSRF protection. You'll also need to use from django.views.decorators.csrf import csrf_exempt. See: https://docs.djangoproject.com/en/dev/ref/csrf/#utilities\nPlease be advised that by disabling CSRF protection on your view, you are opening a gate for CSRF attacks.\nIf security is vital to you then consider using @csrf_exempt followed by @requires_csrf_token (see: https://docs.djangoproject.com/en/dev/ref/csrf/#unprotected-view-needs-the-csrf-token). Then, in your script pass this token and that's it.\n"}
{"input": "interprocess communication in python What is a clean and elegant way to do interprocess communication between two different python processes?  I currently use named pipes in the OS, but it feels a bit hacky.  I rewrote my stuff with dbus services, which worked, but it seems when running the code remotely through an SSH session it now tries to initialise X11 which seems completely unnecessary for the things I want to do (they aren't GUI related).  So maybe dbus is a bit too heavyweight.  I was about to redesign again using sockets, but it seems quite low-level so I thought there might be a higher level module I could import and use which I simply don't know the name of, and I thought I should ask on SO first..\nMy requirement is to be able to run python foo.py and have that process just doing it's thing there, like a daemon, and be able to send messages to it with python foo.py --bar.  The latter call should just sends a message to the existing process and terminates, possibly with a return code 0 for success or other for failure (so some two-way communication will be required).  \n", "output": "The multiprocessing library provides listeners and clients that wrap sockets and allow you to pass arbitrary python objects.\nYour server could listen to receive python objects:\nfrom multiprocessing.connection import Listener\n\naddress = ('localhost', 6000)     # family is deduced to be 'AF_INET'\nlistener = Listener(address, authkey='secret password')\nconn = listener.accept()\nprint 'connection accepted from', listener.last_accepted\nwhile True:\n    msg = conn.recv()\n    # do something with msg\n    if msg == 'close':\n        conn.close()\n        break\nlistener.close()\n\nYour client could send commands as objects:\nfrom multiprocessing.connection import Client\n\naddress = ('localhost', 6000)\nconn = Client(address, authkey='secret password')\nconn.send('close')\n# can also send arbitrary objects:\n# conn.send(['a', 2.5, None, int, sum])\nconn.close()\n\n"}
{"input": "Add number to set What am I doing wrong here?\na = set().add(1)\nprint a # Prints `None`\n\nI'm trying to add the number 1 to the empty set.\n", "output": "It is a convention in Python that methods that mutate sequences return None.\nConsider:\n>>> a_list = [3, 2, 1]\n>>> print a_list.sort()\nNone\n>>> a_list\n[1, 2, 3]\n\n>>> a_dict = {}\n>>> print a_dict.__setitem__('a', 1)\nNone\n>>> a_dict\n{'a': 1}\n\n>>> a_set = set()\n>>> print a_set.add(1)\nNone\n>>> a_set\nset([1])\n\nSome may consider this convention \"a horrible misdesign in Python\", but the Design and History FAQ gives the reasoning behind this design decision (with respect to lists):\n\nWhy doesn\u00e2\u0080\u0099t list.sort() return the sorted list?\nIn situations where performance matters, making a copy of the list\n  just to sort it would be wasteful. Therefore, list.sort() sorts the\n  list in place. In order to remind you of that fact, it does not return\n  the sorted list. This way, you won\u00e2\u0080\u0099t be fooled into accidentally\n  overwriting a list when you need a sorted copy but also need to keep\n  the unsorted version around.\nIn Python 2.4 a new built-in function \u00e2\u0080\u0093 sorted() \u00e2\u0080\u0093 has been added.\n  This function creates a new list from a provided iterable, sorts it\n  and returns it.\n\nYour particular problems with this feature come from a misunderstanding of good ways to create a set rather than a language misdesign.  As Lattyware points out, in Python versions 2.7 and later you can use a set literal a = {1} or do a = set([1]) as per Sven Marnach's answer.\nParenthetically, I like Ruby's convention of placing an exclamation point after methods that mutate objects, but I find Python's approach acceptable.\n"}
{"input": "Why won't Django use IPython? (myvenv)me:src orokusaki$ python manage.py shell -i ipython\nPython 2.7.2 (default, Jun 16 2012, 12:38:40) \n[GCC 4.2.1 Compatible Apple Clang 4.0 (tags/Apple/clang-418.0.60)] on darwin\nType \"help\", \"copyright\", \"credits\" or \"license\" for more information.\n(InteractiveConsole)\n>>> while True:\n...     pass  # :(\n...\n\nI have IPython installed globally, and it works while this myvenv virtualenv is activated just fine. However, if I start the Django shell (with or without -i), it won't use IPython. I've never had this problem before.\nNote: django-admin.py vs manage.py won't make a difference.\n", "output": "Try installing it into virtualenv! :-)\n"}
{"input": "Can you change a field label in the Django Admin application? As the title suggests. I want to be able to change the label of a single field in the admin application. I'm aware of the Form.field attribute, but how do I get my Model or ModelAdmin to pass along that information?\n", "output": "the verbose name of the field is the (optional) first parameter at field construction.\n"}
{"input": "How to fetch a non-ascii url with Python urlopen? I need to fetch data from a URL with non-ascii characters but urllib2.urlopen refuses to open the resource and raises:\nUnicodeEncodeError: 'ascii' codec can't encode character u'\\u0131' in position 26: ordinal not in range(128)\n\nI know the URL is not standards compliant but I have no chance to change it. \nWhat is the way to access a resource pointed by a URL containing non-ascii characters using Python?\nedit: In other words, can / how urlopen open a URL like:\nhttp://example.org/\u00c3\u0091\u00c3\u00b6\u00c3\u00b1-\u00c3\u0085\u00c5\u009e\u00c3\u0087\u00c4\u00b0\u00c4\u00b0/\n\n", "output": "Strictly speaking URIs can't contain non-ASCII characters; what you have there is an IRI.\nTo convert an IRI to a plain ASCII URI:\n\nnon-ASCII characters in the hostname part of the address have to be encoded using the Punycode-based IDNA algorithm;\nnon-ASCII characters in the path, and most of the other parts of the address have to be encoded using UTF-8 and %-encoding, as per Ignacio's answer.\n\nSo:\nimport re, urlparse\n\ndef urlEncodeNonAscii(b):\n    return re.sub('[\\x80-\\xFF]', lambda c: '%%%02x' % ord(c.group(0)), b)\n\ndef iriToUri(iri):\n    parts= urlparse.urlparse(iri)\n    return urlparse.urlunparse(\n        part.encode('idna') if parti==1 else urlEncodeNonAscii(part.encode('utf-8'))\n        for parti, part in enumerate(parts)\n    )\n\n>>> iriToUri(u'http://www.a\\u0131b.com/a\\u0131b')\n'http://www.xn--ab-hpa.com/a%c4%b1b'\n\n(Technically this still isn't quite good enough in the general case because urlparse doesn't split away any user:pass@ prefix or :port suffix on the hostname. Only the hostname part should be IDNA encoded. It's easier to encode using normal urllib.quote and .encode('idna') at the time you're constructing a URL than to have to pull an IRI apart.)\n"}
{"input": "List database tables with SQLAlchemy I want to implement a function that gives information about all the tables (and their column names) that are present in a database (not only those created with SQLAlchemy). While reading the documentation it seems to me that this is done via reflection but I didn't manage to get something working. Any suggestions or examples on how to do this?\n", "output": "start with an engine:\nfrom sqlalchemy import create_engine\nengine = create_engine(\"postgresql://u:p@host/database\")\n\nquick path to all table /column names, use an inspector:\nfrom sqlalchemy import inspect\ninspector = inspect(engine)\n\nfor table_name in inspector.get_table_names():\n   for column in inspector.get_columns(table_name):\n       print(\"Column: %s\" % column['name'])\n\ndocs: http://docs.sqlalchemy.org/en/rel_0_9/core/reflection.html?highlight=inspector#fine-grained-reflection-with-inspector\nalternatively, use MetaData / Tables:\nfrom sqlalchemy import MetaData\nm = MetaData()\nm.reflect(engine)\nfor table in m.tables.values():\n    print(table.name)\n    for column in table.c:\n        print(column.name)\n\ndocs: http://docs.sqlalchemy.org/en/rel_0_9/core/reflection.html#reflecting-all-tables-at-once\n"}
{"input": "Optimize the performance of dictionary membership for a list of Keys I am trying to write a code which should return true if any element of list is present in a dictionary. Performance of this piece is really important. I know I can just loop over list and break if I find the first search hit. Is there any faster or more Pythonic way for this than given below?\nfor x in someList:\n     if x in someDict:\n           return True\nreturn False\n\nEDIT: I am using Python 2.7. My first preference would be a faster method. \n", "output": "Use of builtin any can have some performance edge over two loops\nany(x in someDict for x in someList)\n\nbut you might need to measure your mileage. If your list and dict remains pretty static and you have to perform the comparison multiple times, you may consider using set\nsomeSet = set(someList) \nsomeDict.viewkeys() & someSet \n\nNote Python 3.X, by default returns views rather than a sequence, so it would be straight forward when using Python 3.X\nsomeSet = set(someList) \nsomeDict.keys() & someSet \n\nIn both the above cases you can wrap the result with a bool to get a boolean result\nbool(someDict.keys() & set(someSet ))\n\nHeretic Note\nMy curiosity got the better of me and I timed all the proposed solutions. It seems that your original solution is better performance wise. Here is the result\nSample Randomly generated Input\ndef test_data_gen():\n    from random import sample\n    for i in range(1,5):\n        n = 10**i\n        population = set(range(1,100000))\n        some_list = sample(list(population),n)\n        population.difference_update(some_list)\n        some_dict = dict(zip(sample(population,n),\n                             sample(range(1,100000),n)))\n        yield \"Population Size of {}\".format(n), (some_list, some_dict), {}\n\nThe Test Engine\nI rewrote the Test Part of the answer as it was messy and the answer was receiving quite a decent attention. I created a timeit compare python module and moved it onto github\nThe Test Result\nTimeit repeated for 10 times\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n======================================\nTest Run for Population Size of 10\n======================================\n|Rank  |FunctionName         |Result    |Description\n+------+---------------------+----------+-----------------------------------------------\n|     1|foo_nested           |0.000011  |Original OPs Code\n+------+---------------------+----------+-----------------------------------------------\n|     2|foo_ifilter_any      |0.000014  |any(ifilter(some_dict.__contains__, some_list))\n+------+---------------------+----------+-----------------------------------------------\n|     3|foo_ifilter_not_not  |0.000015  |not not next(ifilter(some_dict.__contains__...\n+------+---------------------+----------+-----------------------------------------------\n|     4|foo_imap_any         |0.000018  |any(imap(some_dict.__contains__, some_list))\n+------+---------------------+----------+-----------------------------------------------\n|     5|foo_any              |0.000019  |any(x in some_dict for x in some_list)\n+------+---------------------+----------+-----------------------------------------------\n|     6|foo_ifilter_next     |0.000022  |bool(next(ifilter(some_dict.__contains__...\n+------+---------------------+----------+-----------------------------------------------\n|     7|foo_set_ashwin       |0.000024  |not set(some_dct).isdisjoint(some_lst)\n+------+---------------------+----------+-----------------------------------------------\n|     8|foo_set              |0.000047  |some_dict.viewkeys() & set(some_list )\n======================================\nTest Run for Population Size of 100\n======================================\n|Rank  |FunctionName         |Result    |Description\n+------+---------------------+----------+-----------------------------------------------\n|     1|foo_ifilter_any      |0.000071  |any(ifilter(some_dict.__contains__, some_list))\n+------+---------------------+----------+-----------------------------------------------\n|     2|foo_nested           |0.000072  |Original OPs Code\n+------+---------------------+----------+-----------------------------------------------\n|     3|foo_ifilter_not_not  |0.000073  |not not next(ifilter(some_dict.__contains__...\n+------+---------------------+----------+-----------------------------------------------\n|     4|foo_ifilter_next     |0.000076  |bool(next(ifilter(some_dict.__contains__...\n+------+---------------------+----------+-----------------------------------------------\n|     5|foo_imap_any         |0.000082  |any(imap(some_dict.__contains__, some_list))\n+------+---------------------+----------+-----------------------------------------------\n|     6|foo_any              |0.000092  |any(x in some_dict for x in some_list)\n+------+---------------------+----------+-----------------------------------------------\n|     7|foo_set_ashwin       |0.000170  |not set(some_dct).isdisjoint(some_lst)\n+------+---------------------+----------+-----------------------------------------------\n|     8|foo_set              |0.000638  |some_dict.viewkeys() & set(some_list )\n======================================\nTest Run for Population Size of 1000\n======================================\n|Rank  |FunctionName         |Result    |Description\n+------+---------------------+----------+-----------------------------------------------\n|     1|foo_ifilter_not_not  |0.000746  |not not next(ifilter(some_dict.__contains__...\n+------+---------------------+----------+-----------------------------------------------\n|     2|foo_ifilter_any      |0.000746  |any(ifilter(some_dict.__contains__, some_list))\n+------+---------------------+----------+-----------------------------------------------\n|     3|foo_ifilter_next     |0.000752  |bool(next(ifilter(some_dict.__contains__...\n+------+---------------------+----------+-----------------------------------------------\n|     4|foo_nested           |0.000771  |Original OPs Code\n+------+---------------------+----------+-----------------------------------------------\n|     5|foo_set_ashwin       |0.000838  |not set(some_dct).isdisjoint(some_lst)\n+------+---------------------+----------+-----------------------------------------------\n|     6|foo_imap_any         |0.000842  |any(imap(some_dict.__contains__, some_list))\n+------+---------------------+----------+-----------------------------------------------\n|     7|foo_any              |0.000933  |any(x in some_dict for x in some_list)\n+------+---------------------+----------+-----------------------------------------------\n|     8|foo_set              |0.001702  |some_dict.viewkeys() & set(some_list )\n======================================\nTest Run for Population Size of 10000\n======================================\n|Rank  |FunctionName         |Result    |Description\n+------+---------------------+----------+-----------------------------------------------\n|     1|foo_nested           |0.007195  |Original OPs Code\n+------+---------------------+----------+-----------------------------------------------\n|     2|foo_ifilter_next     |0.007410  |bool(next(ifilter(some_dict.__contains__...\n+------+---------------------+----------+-----------------------------------------------\n|     3|foo_ifilter_any      |0.007491  |any(ifilter(some_dict.__contains__, some_list))\n+------+---------------------+----------+-----------------------------------------------\n|     4|foo_ifilter_not_not  |0.007671  |not not next(ifilter(some_dict.__contains__...\n+------+---------------------+----------+-----------------------------------------------\n|     5|foo_set_ashwin       |0.008385  |not set(some_dct).isdisjoint(some_lst)\n+------+---------------------+----------+-----------------------------------------------\n|     6|foo_imap_any         |0.011327  |any(imap(some_dict.__contains__, some_list))\n+------+---------------------+----------+-----------------------------------------------\n|     7|foo_any              |0.011533  |any(x in some_dict for x in some_list)\n+------+---------------------+----------+-----------------------------------------------\n|     8|foo_set              |0.018313  |some_dict.viewkeys() & set(some_list )\nTimeit repeated for 100 times\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n======================================\nTest Run for Population Size of 10\n======================================\n|Rank  |FunctionName         |Result    |Description\n+------+---------------------+----------+-----------------------------------------------\n|     1|foo_nested           |0.000098  |Original OPs Code\n+------+---------------------+----------+-----------------------------------------------\n|     2|foo_ifilter_any      |0.000124  |any(ifilter(some_dict.__contains__, some_list))\n+------+---------------------+----------+-----------------------------------------------\n|     3|foo_ifilter_not_not  |0.000131  |not not next(ifilter(some_dict.__contains__...\n+------+---------------------+----------+-----------------------------------------------\n|     4|foo_imap_any         |0.000142  |any(imap(some_dict.__contains__, some_list))\n+------+---------------------+----------+-----------------------------------------------\n|     5|foo_ifilter_next     |0.000151  |bool(next(ifilter(some_dict.__contains__...\n+------+---------------------+----------+-----------------------------------------------\n|     6|foo_any              |0.000158  |any(x in some_dict for x in some_list)\n+------+---------------------+----------+-----------------------------------------------\n|     7|foo_set_ashwin       |0.000186  |not set(some_dct).isdisjoint(some_lst)\n+------+---------------------+----------+-----------------------------------------------\n|     8|foo_set              |0.000496  |some_dict.viewkeys() & set(some_list )\n======================================\nTest Run for Population Size of 100\n======================================\n|Rank  |FunctionName         |Result    |Description\n+------+---------------------+----------+-----------------------------------------------\n|     1|foo_ifilter_any      |0.000661  |any(ifilter(some_dict.__contains__, some_list))\n+------+---------------------+----------+-----------------------------------------------\n|     2|foo_ifilter_not_not  |0.000677  |not not next(ifilter(some_dict.__contains__...\n+------+---------------------+----------+-----------------------------------------------\n|     3|foo_nested           |0.000683  |Original OPs Code\n+------+---------------------+----------+-----------------------------------------------\n|     4|foo_ifilter_next     |0.000684  |bool(next(ifilter(some_dict.__contains__...\n+------+---------------------+----------+-----------------------------------------------\n|     5|foo_imap_any         |0.000762  |any(imap(some_dict.__contains__, some_list))\n+------+---------------------+----------+-----------------------------------------------\n|     6|foo_any              |0.000854  |any(x in some_dict for x in some_list)\n+------+---------------------+----------+-----------------------------------------------\n|     7|foo_set_ashwin       |0.001291  |not set(some_dct).isdisjoint(some_lst)\n+------+---------------------+----------+-----------------------------------------------\n|     8|foo_set              |0.005018  |some_dict.viewkeys() & set(some_list )\n======================================\nTest Run for Population Size of 1000\n======================================\n|Rank  |FunctionName         |Result    |Description\n+------+---------------------+----------+-----------------------------------------------\n|     1|foo_ifilter_any      |0.007585  |any(ifilter(some_dict.__contains__, some_list))\n+------+---------------------+----------+-----------------------------------------------\n|     2|foo_nested           |0.007713  |Original OPs Code\n+------+---------------------+----------+-----------------------------------------------\n|     3|foo_set_ashwin       |0.008256  |not set(some_dct).isdisjoint(some_lst)\n+------+---------------------+----------+-----------------------------------------------\n|     4|foo_ifilter_not_not  |0.008526  |not not next(ifilter(some_dict.__contains__...\n+------+---------------------+----------+-----------------------------------------------\n|     5|foo_any              |0.009422  |any(x in some_dict for x in some_list)\n+------+---------------------+----------+-----------------------------------------------\n|     6|foo_ifilter_next     |0.010259  |bool(next(ifilter(some_dict.__contains__...\n+------+---------------------+----------+-----------------------------------------------\n|     7|foo_imap_any         |0.011414  |any(imap(some_dict.__contains__, some_list))\n+------+---------------------+----------+-----------------------------------------------\n|     8|foo_set              |0.019862  |some_dict.viewkeys() & set(some_list )\n======================================\nTest Run for Population Size of 10000\n======================================\n|Rank  |FunctionName         |Result    |Description\n+------+---------------------+----------+-----------------------------------------------\n|     1|foo_imap_any         |0.082221  |any(imap(some_dict.__contains__, some_list))\n+------+---------------------+----------+-----------------------------------------------\n|     2|foo_ifilter_any      |0.083573  |any(ifilter(some_dict.__contains__, some_list))\n+------+---------------------+----------+-----------------------------------------------\n|     3|foo_nested           |0.095736  |Original OPs Code\n+------+---------------------+----------+-----------------------------------------------\n|     4|foo_set_ashwin       |0.103427  |not set(some_dct).isdisjoint(some_lst)\n+------+---------------------+----------+-----------------------------------------------\n|     5|foo_ifilter_next     |0.104589  |bool(next(ifilter(some_dict.__contains__...\n+------+---------------------+----------+-----------------------------------------------\n|     6|foo_ifilter_not_not  |0.117974  |not not next(ifilter(some_dict.__contains__...\n+------+---------------------+----------+-----------------------------------------------\n|     7|foo_any              |0.127739  |any(x in some_dict for x in some_list)\n+------+---------------------+----------+-----------------------------------------------\n|     8|foo_set              |0.208228  |some_dict.viewkeys() & set(some_list )\nTimeit repeated for 1000 times\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n======================================\nTest Run for Population Size of 10\n======================================\n|Rank  |FunctionName         |Result    |Description\n+------+---------------------+----------+-----------------------------------------------\n|     1|foo_nested           |0.000953  |Original OPs Code\n+------+---------------------+----------+-----------------------------------------------\n|     2|foo_ifilter_any      |0.001134  |any(ifilter(some_dict.__contains__, some_list))\n+------+---------------------+----------+-----------------------------------------------\n|     3|foo_ifilter_not_not  |0.001213  |not not next(ifilter(some_dict.__contains__...\n+------+---------------------+----------+-----------------------------------------------\n|     4|foo_ifilter_next     |0.001340  |bool(next(ifilter(some_dict.__contains__...\n+------+---------------------+----------+-----------------------------------------------\n|     5|foo_imap_any         |0.001407  |any(imap(some_dict.__contains__, some_list))\n+------+---------------------+----------+-----------------------------------------------\n|     6|foo_any              |0.001535  |any(x in some_dict for x in some_list)\n+------+---------------------+----------+-----------------------------------------------\n|     7|foo_set_ashwin       |0.002252  |not set(some_dct).isdisjoint(some_lst)\n+------+---------------------+----------+-----------------------------------------------\n|     8|foo_set              |0.004701  |some_dict.viewkeys() & set(some_list )\n======================================\nTest Run for Population Size of 100\n======================================\n|Rank  |FunctionName         |Result    |Description\n+------+---------------------+----------+-----------------------------------------------\n|     1|foo_ifilter_any      |0.006209  |any(ifilter(some_dict.__contains__, some_list))\n+------+---------------------+----------+-----------------------------------------------\n|     2|foo_ifilter_next     |0.006411  |bool(next(ifilter(some_dict.__contains__...\n+------+---------------------+----------+-----------------------------------------------\n|     3|foo_ifilter_not_not  |0.006657  |not not next(ifilter(some_dict.__contains__...\n+------+---------------------+----------+-----------------------------------------------\n|     4|foo_nested           |0.006727  |Original OPs Code\n+------+---------------------+----------+-----------------------------------------------\n|     5|foo_imap_any         |0.007562  |any(imap(some_dict.__contains__, some_list))\n+------+---------------------+----------+-----------------------------------------------\n|     6|foo_any              |0.008262  |any(x in some_dict for x in some_list)\n+------+---------------------+----------+-----------------------------------------------\n|     7|foo_set_ashwin       |0.012260  |not set(some_dct).isdisjoint(some_lst)\n+------+---------------------+----------+-----------------------------------------------\n|     8|foo_set              |0.046773  |some_dict.viewkeys() & set(some_list )\n======================================\nTest Run for Population Size of 1000\n======================================\n|Rank  |FunctionName         |Result    |Description\n+------+---------------------+----------+-----------------------------------------------\n|     1|foo_ifilter_not_not  |0.071888  |not not next(ifilter(some_dict.__contains__...\n+------+---------------------+----------+-----------------------------------------------\n|     2|foo_ifilter_next     |0.072150  |bool(next(ifilter(some_dict.__contains__...\n+------+---------------------+----------+-----------------------------------------------\n|     3|foo_nested           |0.073382  |Original OPs Code\n+------+---------------------+----------+-----------------------------------------------\n|     4|foo_ifilter_any      |0.075698  |any(ifilter(some_dict.__contains__, some_list))\n+------+---------------------+----------+-----------------------------------------------\n|     5|foo_set_ashwin       |0.077367  |not set(some_dct).isdisjoint(some_lst)\n+------+---------------------+----------+-----------------------------------------------\n|     6|foo_imap_any         |0.090623  |any(imap(some_dict.__contains__, some_list))\n+------+---------------------+----------+-----------------------------------------------\n|     7|foo_any              |0.093301  |any(x in some_dict for x in some_list)\n+------+---------------------+----------+-----------------------------------------------\n|     8|foo_set              |0.177051  |some_dict.viewkeys() & set(some_list )\n======================================\nTest Run for Population Size of 10000\n======================================\n|Rank  |FunctionName         |Result    |Description\n+------+---------------------+----------+-----------------------------------------------\n|     1|foo_nested           |0.701317  |Original OPs Code\n+------+---------------------+----------+-----------------------------------------------\n|     2|foo_ifilter_next     |0.706156  |bool(next(ifilter(some_dict.__contains__...\n+------+---------------------+----------+-----------------------------------------------\n|     3|foo_ifilter_any      |0.723368  |any(ifilter(some_dict.__contains__, some_list))\n+------+---------------------+----------+-----------------------------------------------\n|     4|foo_ifilter_not_not  |0.746650  |not not next(ifilter(some_dict.__contains__...\n+------+---------------------+----------+-----------------------------------------------\n|     5|foo_set_ashwin       |0.776704  |not set(some_dct).isdisjoint(some_lst)\n+------+---------------------+----------+-----------------------------------------------\n|     6|foo_imap_any         |0.832117  |any(imap(some_dict.__contains__, some_list))\n+------+---------------------+----------+-----------------------------------------------\n|     7|foo_any              |0.881777  |any(x in some_dict for x in some_list)\n+------+---------------------+----------+-----------------------------------------------\n|     8|foo_set              |1.665962  |some_dict.viewkeys() & set(some_list )\nTimeit repeated for 10000 times\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n======================================\nTest Run for Population Size of 10\n======================================\n|Rank  |FunctionName         |Result    |Description\n+------+---------------------+----------+-----------------------------------------------\n|     1|foo_nested           |0.010581  |Original OPs Code\n+------+---------------------+----------+-----------------------------------------------\n|     2|foo_ifilter_any      |0.013512  |any(ifilter(some_dict.__contains__, some_list))\n+------+---------------------+----------+-----------------------------------------------\n|     3|foo_imap_any         |0.015321  |any(imap(some_dict.__contains__, some_list))\n+------+---------------------+----------+-----------------------------------------------\n|     4|foo_ifilter_not_not  |0.017680  |not not next(ifilter(some_dict.__contains__...\n+------+---------------------+----------+-----------------------------------------------\n|     5|foo_ifilter_next     |0.019334  |bool(next(ifilter(some_dict.__contains__...\n+------+---------------------+----------+-----------------------------------------------\n|     6|foo_any              |0.026274  |any(x in some_dict for x in some_list)\n+------+---------------------+----------+-----------------------------------------------\n|     7|foo_set_ashwin       |0.030881  |not set(some_dct).isdisjoint(some_lst)\n+------+---------------------+----------+-----------------------------------------------\n|     8|foo_set              |0.053605  |some_dict.viewkeys() & set(some_list )\n======================================\nTest Run for Population Size of 100\n======================================\n|Rank  |FunctionName         |Result    |Description\n+------+---------------------+----------+-----------------------------------------------\n|     1|foo_nested           |0.070194  |Original OPs Code\n+------+---------------------+----------+-----------------------------------------------\n|     2|foo_ifilter_not_not  |0.078524  |not not next(ifilter(some_dict.__contains__...\n+------+---------------------+----------+-----------------------------------------------\n|     3|foo_ifilter_any      |0.079499  |any(ifilter(some_dict.__contains__, some_list))\n+------+---------------------+----------+-----------------------------------------------\n|     4|foo_imap_any         |0.087349  |any(imap(some_dict.__contains__, some_list))\n+------+---------------------+----------+-----------------------------------------------\n|     5|foo_ifilter_next     |0.093970  |bool(next(ifilter(some_dict.__contains__...\n+------+---------------------+----------+-----------------------------------------------\n|     6|foo_any              |0.097948  |any(x in some_dict for x in some_list)\n+------+---------------------+----------+-----------------------------------------------\n|     7|foo_set_ashwin       |0.130725  |not set(some_dct).isdisjoint(some_lst)\n+------+---------------------+----------+-----------------------------------------------\n|     8|foo_set              |0.480841  |some_dict.viewkeys() & set(some_list )\n======================================\nTest Run for Population Size of 1000\n======================================\n|Rank  |FunctionName         |Result    |Description\n+------+---------------------+----------+-----------------------------------------------\n|     1|foo_ifilter_any      |0.754491  |any(ifilter(some_dict.__contains__, some_list))\n+------+---------------------+----------+-----------------------------------------------\n|     2|foo_ifilter_not_not  |0.756253  |not not next(ifilter(some_dict.__contains__...\n+------+---------------------+----------+-----------------------------------------------\n|     3|foo_ifilter_next     |0.771382  |bool(next(ifilter(some_dict.__contains__...\n+------+---------------------+----------+-----------------------------------------------\n|     4|foo_nested           |0.787152  |Original OPs Code\n+------+---------------------+----------+-----------------------------------------------\n|     5|foo_set_ashwin       |0.818520  |not set(some_dct).isdisjoint(some_lst)\n+------+---------------------+----------+-----------------------------------------------\n|     6|foo_imap_any         |0.902947  |any(imap(some_dict.__contains__, some_list))\n+------+---------------------+----------+-----------------------------------------------\n|     7|foo_any              |1.001810  |any(x in some_dict for x in some_list)\n+------+---------------------+----------+-----------------------------------------------\n|     8|foo_set              |2.012781  |some_dict.viewkeys() & set(some_list )\n=======================================\nTest Run for Population Size of 10000\n=======================================\n|Rank  |FunctionName         |Result     |Description\n+------+---------------------+-----------+-----------------------------------------------\n|     1|foo_imap_any         |10.071469  |any(imap(some_dict.__contains__, some_list))\n+------+---------------------+-----------+-----------------------------------------------\n|     2|foo_any              |11.127034  |any(x in some_dict for x in some_list)\n+------+---------------------+-----------+-----------------------------------------------\n|     3|foo_set              |18.881414  |some_dict.viewkeys() & set(some_list )\n+------+---------------------+-----------+-----------------------------------------------\n|     4|foo_nested           |8.731133   |Original OPs Code\n+------+---------------------+-----------+-----------------------------------------------\n|     5|foo_ifilter_not_not  |9.019190   |not not next(ifilter(some_dict.__contains__...\n+------+---------------------+-----------+-----------------------------------------------\n|     6|foo_ifilter_next     |9.189966   |bool(next(ifilter(some_dict.__contains__...\n+------+---------------------+-----------+-----------------------------------------------\n|     7|foo_set_ashwin       |9.363886   |not set(some_dct).isdisjoint(some_lst)\n+------+---------------------+-----------+-----------------------------------------------\n|     8|foo_ifilter_any      |9.442759   |any(ifilter(some_dict.__contains__, some_list))\n\nAnd a Graphical Comparison from the above referred module\n\n\n\n\nConclusion\nPremature optimization is evil. It is evident that none of the solutions have optimal performance when the test domain varies. Depending on population size and frequency of iteration, performance of solutions varies considerably. The result again speaks out about the fact that in Python, one should ensure that the code should be readable rather than ensuring that the code is either nifty or optimized for performance for certain cases, but then it may not be scalable.\nNote There were some doubts on why not using ifilter performs better than the rest \n\n\"In Abhit's answer, he timed the different approaches and found that\n  ifilter/next was not the fastest; any idea why this would be the case?\n  \"\n\nIt is a known fact that in python, there is an overhead when calling C functions, and if the population size is low but the frequency of iteration is high, the accumulation of C function call overhead would slowly show up. As can be seen in the graphs, where population size is low but iteration is high, using ifilter, performance deviates considerably. \n"}
{"input": "Nonalphanumeric list order from os.listdir() in Python I often use python to process directories of data. Recently, I have noticed that the default order of the lists has changed to something almost nonsensical. For example, if I am in a current directory containing the following subdirectories: run01, run02, ... run19, run20, and then I generate a list from the following command:\ndir = os.listdir(os.getcwd())\n\nthen I usually get a list in this order:\ndir = ['run01', 'run18', 'run14', 'run13', 'run12', 'run11', 'run08', ... ]\n\nand so on. The order used to be alphanumeric. But this new order has remained with me for a while now.\nWhat is determining the (displayed) order of these lists? Is there some sort of rule here? Is it the result of some setting I obliviously changed?\nEDIT: Thank you for the suggestions everyone, does anyone know where/how the filesystem determines this order? I've determined that this only occurs on a MacFUSE mounted drive.\n", "output": "You can use the builtin sorted function to sort the strings however you want.  Based on what you describe, \nsorted(os.listdir(whatever_directory))\n\nAlternatively, you can use the .sort method of a list:\nlst = os.listdir(whatever_directory)\nlst.sort()\n\nI think should do the trick.\nNote that the order that os.listdir gets the filenames is probably completely dependent on your filesystem.\n"}
{"input": "Automatically import models on Django shell launch I'm tired of typing from account_import.models import ImportFile every time I open my Django shell. Is there a way to automatically run this command whenever I fire up my shell?\n", "output": "install django-extensions, one of the commands it features (shell_plus) is providing the context for your models.\nhttps://github.com/django-extensions/django-extensions\nSo, instead of ./manage.py shell you can use ./manage.py shell_plus so that everything is imported.\n"}
{"input": "Slicing a list in Python without generating a copy I have the following problem. \n\nGiven a list of integers L, I need to generate all of the sublists L[k:] for k in [0, len(L) - 1], without generating copies. \n\nHow do I accomplish this in Python? With a buffer object somehow?\n", "output": "Slicing lists does not generate copies of the objects in the list; it just copies the references to them. \nI had a moment of uncertainty about immutable objects; but I checked and indeed, only the reference is copied:\n>>> a = [1, 2, 3]\n>>> b = a[1:3]\n>>> id(a[1])\n143727964\n>>> id(b[0])\n143727964\n\n"}
{"input": "Why is equivalent Python code so much slower can somebody explain why is the following trivial code (implementation of Euclid's algorithm to find greatest common denominator) about 3 times slower then equivalent code in Ruby ?\ncontents of iter_gcd.py:\nfrom sys import argv,stderr\n\ndef gcd(m, n):\n    if n > m:\n        m, n = n, m\n    while n != 0:\n        rem = m % n\n        m = n\n        n = rem\n    return m\n\n# in Python3 code there is xrange replaced with range function\ndef main(a1, a2):\n    comp = 0\n    for j in xrange(a1, 1, -1):\n        for i in xrange(1, a2):\n            comp += gcd(i,j)\n\n    print(comp)\n\nif __name__ == '__main__':\n    if len(argv) != 3:\n        stderr.write('usage: {0:s} num1 num2\\n'.format(argv[0]))\n        exit(1)\n    else:\n        main(int(argv[1]), int(argv[2]))\n\ncontents of iter_gcd.rb:\ndef gcd(m, n)\n    while n != 0\n        rem = m % n\n        m = n\n        n = rem\n    end\n    return m\nend\n\ndef main(a1, a2)\n    comp = 0\n    a1.downto 2 do\n        |j|\n        1.upto (a2 - 1) do\n            |i|\n            comp += gcd(i,j)\n        end\n    end\n    puts comp\nend\n\n if __FILE__ == $0\n    if ARGV.length != 2\n        $stderr.puts('usage: %s num1 num2' % $0)\n        exit(1)\n    else\n        main(ARGV[0].to_i, ARGV[1].to_i)\n    end\nend\n\nExecution times measurements:\n$ time python iter_gcd.py 4000 3000\n61356305\n\nreal    0m22.890s\nuser    0m22.867s\nsys     0m0.006s\n\n$ python -V\nPython 2.6.4\n\n\n$ time python3 iter_gcd.py 4000 3000\n61356305\n\nreal    0m18.634s\nuser    0m18.615s\nsys     0m0.009s\n\n$ python3 -V\nPython 3.1.2\n\n\n$ time ruby iter_gcd.rb 4000 3000\n61356305\n\nreal    0m7.619s\nuser    0m7.616s\nsys     0m0.003s\n\n$ ruby -v\nruby 1.9.2p0 (2010-08-18 revision 29036) [x86_64-linux]\n\nJust curious why I got such results. I considered CPython to be faster in most cases then MRI and even the new Ruby 1.9 on YARV but this \"microbenchmark\" did really surprised me.\nBtw, I know I can use specialised library function like fractions.gcd but I'd like to compare implementations of such basic and trivial language constructs.\nDid I miss something or is the implementation of the next Ruby generation so much improved in area of sheer speed ?\n", "output": "Summary\n\"Because the function call overhead in Python is much larger than in Ruby.\"\nDetails\nBeing a microbenchmark, this really doesn't say much about the performance of either language in proper use. Likely you would want to rewrite the program to take advantage of the strengths of Python and Ruby, but this does illustrate one of the weak points of Python at the moment. The root cause of the speed differences come from function call overhead. I made a few tests to illustrate. See below for code and more details. For the Python tests, I used 2000 for both gcd parameters.\nInterpreter: Python 2.6.6\nProgram type: gcd using function call\nTotal CPU time: 29.336 seconds\n\nInterpreter: Python 2.6.6\nProgram type: gcd using inline code\nTotal CPU time: 13.194 seconds\n\nInterpreter: Python 2.6.6\nProgram type: gcd using inline code, with dummy function call\nTotal CPU  time: 30.672 seconds\n\nThis tells us that it's not the calculation made by the gcd function that contributes most to the time difference, it's the function call itself. With Python 3.1, the difference is similar:\nInterpreter: Python 3.1.3rc1\nProgram type: gcd using function call\nTotal CPU time: 30.920 seconds\n\nInterpreter: Python 3.1.3rc1\nProgram type: gcd using inline code\nTotal CPU time: 15.185 seconds\n\nInterpreter: Python 3.1.3rc1\nProgram type: gcd using inline code, with dummy function call\nTotal CPU time: 33.739 seconds\n\nAgain, the actual calculation is not biggest contributor, it's the function call itself. In Ruby, the function call overhead is much smaller. (Note: I had to use smaller parameters (200) for the Ruby version of the programs because the Ruby profiler really slows down real-time performance. That doesn't affect CPU time performance, though.)\nInterpreter: ruby 1.9.2p0 (2010-08-18 revision 29036) [i486-linux]\nProgram type: gcd using function call\nTotal CPU time: 21.66 seconds\n\nInterpreter: ruby 1.9.2p0 (2010-08-18 revision 29036) [i486-linux]\nProgram type: gcd using inline code\nTotal CPU time: 21.31 seconds\n\nInterpreter: ruby 1.8.7 (2010-08-16 patchlevel 302) [i486-linux]\nProgram type: gcd using function call\nTotal CPU time: 27.00 seconds\n\nInterpreter: ruby 1.8.7 (2010-08-16 patchlevel 302) [i486-linux]\nProgram type: gcd using inline code\nTotal CPU time: 24.83 seconds\n\nNotice how neither Ruby 1.8 nor 1.9 suffer greatly from the gcd function call \u00e2\u0080\u0093 the function call vs. inline version are more or less equal. Ruby 1.9 seems to be a little better with less difference between the function call and inline versions.\nSo the answer to the question is: \"because the function call overhead in Python is much larger than in Ruby\".\nCode\n# iter_gcd -- Python 2.x version, with gcd function call\n#             Python 3.x version uses range instead of xrange\nfrom sys import argv,stderr\n\ndef gcd(m, n):\n    if n > m:\n        m, n = n, m\n    while n != 0:\n        rem = m % n\n        m = n\n        n = rem\n    return m\n\ndef main(a1, a2):\n    comp = 0\n    for j in xrange(a1, 1, -1):\n        for i in xrange(1, a2):\n            comp += gcd(i,j)\n    print(comp)\n\nif __name__ == '__main__':\n    if len(argv) != 3:\n        stderr.write('usage: {0:s} num1 num2\\n'.format(argv[0]))\n        exit(1)\n    else:\n        main(int(argv[1]), int(argv[2]))\n\n\n# iter_gcd -- Python 2.x version, inline calculation\n#             Python 3.x version uses range instead of xrange\nfrom sys import argv,stderr\n\ndef main(a1, a2):\n    comp = 0\n    for j in xrange(a1, 1, -1):\n        for i in xrange(1, a2):\n            if i < j:\n                m, n = j, i\n            else:\n                m, n = i, j\n            while n != 0:\n                rem = m % n\n                m = n\n                n = rem\n            comp += m\n    print(comp)\n\nif __name__ == '__main__':\n    if len(argv) != 3:\n        stderr.write('usage: {0:s} num1 num2\\n'.format(argv[0]))\n        exit(1)\n    else:\n        main(int(argv[1]), int(argv[2]))\n\n\n# iter_gcd -- Python 2.x version, inline calculation, dummy function call\n#             Python 3.x version uses range instead of xrange\nfrom sys import argv,stderr\n\ndef dummyfunc(n, m):\n    a = n + m\n\ndef main(a1, a2):\n    comp = 0\n    for j in xrange(a1, 1, -1):\n        for i in xrange(1, a2):\n            if i < j:\n                m, n = j, i\n            else:\n                m, n = i, j\n            while n != 0:\n                rem = m % n\n                m = n\n                n = rem\n            comp += m\n            dummyfunc(i, j)\n    print(comp)\n\nif __name__ == '__main__':\n    if len(argv) != 3:\n        stderr.write('usage: {0:s} num1 num2\\n'.format(argv[0]))\n        exit(1)\n    else:\n        main(int(argv[1]), int(argv[2]))\n\n\n# iter_gcd -- Ruby version, with gcd function call\n\ndef gcd(m, n)\n    if n > m\n        m, n = n, m\n    end\n    while n != 0\n        rem = m % n\n        m = n\n        n = rem\n    end\n    return m\nend\n\ndef main(a1, a2)\n    comp = 0\n    a1.downto 2 do\n        |j|\n        1.upto a2-1 do\n            |i|\n            comp += gcd(i,j)\n        end\n    end\n    puts comp\nend\n\n if __FILE__ == $0\n    if ARGV.length != 2\n        $stderr.puts('usage: %s num1 num2' % $0)\n        exit(1)\n    else\n        main(ARGV[0].to_i, ARGV[1].to_i)\n    end\nend\n\n\n# iter_gcd -- Ruby version, with inline gcd\n\ndef main(a1, a2)\n    comp = 0\n    a1.downto 2 do |j|\n        1.upto a2-1 do |i|\n            m, n = i, j\n            if n > m\n                m, n = n, m\n            end\n            while n != 0\n                rem = m % n\n                m = n\n                n = rem\n            end\n            comp += m\n        end\n    end\n    puts comp\nend\n\n if __FILE__ == $0\n    if ARGV.length != 2\n        $stderr.puts('usage: %s num1 num2' % $0)\n        exit(1)\n    else\n        main(ARGV[0].to_i, ARGV[1].to_i)\n    end\nend\n\n\nTest runs\nFinally, the commands used to run Python and Ruby with profiling to get the numbers for comparison were pythonX.X -m cProfile iter_gcdX.py 2000 2000 for Python and rubyX.X -rprofile iter_gcdX.rb 200 200 for Ruby. The reason for the difference is that the Ruby profiler adds a lot of overhead. The results are still valid because I'm comparing the difference between a function call and inline code, not the difference between Python and Ruby as such.\nSee also\nWhy is python slower compared to Ruby even with this very simple \u00e2\u0080\u009ctest\u00e2\u0080\u009d?\nIs there something wrong with this python code, why does it run so slow compared to ruby?\nThe Computer Language Benchmarks Game\nGoogle Search: ruby python function call faster\n"}
{"input": "error: Setup script exited with error: command 'gcc' failed with exit status 1 I get the following error when I try to install MySQL-python-1.2.3 under Python 2.6 in Fedora 14. \nFedora 14 comes with Python 2.7 by default and I am working in a project which runs in Python 2.6, So I am not in a position to update Python from 2.6 to 2.7.\n_mysql.c:35:23: fatal error: my_config.h: No such file or directory\ncompilation terminated.\nerror: command 'gcc' failed with exit status 1\n\nThe complete error message is as below\n[root@localhost MySQL-python-1.2.2]# python setup.py build\nrunning build\nrunning build_py\ncreating build\ncreating build/lib.linux-i686-2.6\ncopying _mysql_exceptions.py -> build/lib.linux-i686-2.6\ncreating build/lib.linux-i686-2.6/MySQLdb\ncopying MySQLdb/__init__.py -> build/lib.linux-i686-2.6/MySQLdb\ncopying MySQLdb/converters.py -> build/lib.linux-i686-2.6/MySQLdb\ncopying MySQLdb/connections.py -> build/lib.linux-i686-2.6/MySQLdb\ncopying MySQLdb/cursors.py -> build/lib.linux-i686-2.6/MySQLdb\ncopying MySQLdb/release.py -> build/lib.linux-i686-2.6/MySQLdb\ncopying MySQLdb/times.py -> build/lib.linux-i686-2.6/MySQLdb\ncreating build/lib.linux-i686-2.6/MySQLdb/constants\ncopying MySQLdb/constants/__init__.py -> build/lib.linux-i686-2.6/MySQLdb/constants\ncopying MySQLdb/constants/CR.py -> build/lib.linux-i686-2.6/MySQLdb/constants\ncopying MySQLdb/constants/FIELD_TYPE.py -> build/lib.linux-i686-2.6/MySQLdb/constants\ncopying MySQLdb/constants/ER.py -> build/lib.linux-i686-2.6/MySQLdb/constants\ncopying MySQLdb/constants/FLAG.py -> build/lib.linux-i686-2.6/MySQLdb/constants\ncopying MySQLdb/constants/REFRESH.py -> build/lib.linux-i686-2.6/MySQLdb/constants\ncopying MySQLdb/constants/CLIENT.py -> build/lib.linux-i686-2.6/MySQLdb/constants\nrunning build_ext\nbuilding '_mysql' extension\ncreating build/temp.linux-i686-2.6\ngcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -Dversion_info=(1,2,2,'final',0) -D__version__=1.2.2 -I/usr/include/mysql -I/usr/local/include/python2.6 -c _mysql.c -o build/temp.linux-i686-2.6/_mysql.o -g -pipe -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -fasynchronous-unwind-tables -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -fno-strict-aliasing -fwrapv -fPIC -DUNIV_LINUX\n_mysql.c:35:23: fatal error: my_config.h: No such file or directory\ncompilation terminated.\nerror: command 'gcc' failed with exit status 1\n[root@localhost MySQL-python-1.2.2]# python2.6 setup.py build\nrunning build\nrunning build_py\ncopying MySQLdb/release.py -> build/lib.linux-i686-2.6/MySQLdb\nrunning build_ext\nbuilding '_mysql' extension\ngcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -Dversion_info=(1,2,2,'final',0) -D__version__=1.2.2 -I/usr/include/mysql -I/usr/local/include/python2.6 -c _mysql.c -o build/temp.linux-i686-2.6/_mysql.o -g -pipe -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -fasynchronous-unwind-tables -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -fno-strict-aliasing -fwrapv -fPIC -DUNIV_LINUX\n_mysql.c:35:23: fatal error: my_config.h: No such file or directory\ncompilation terminated.\nerror: command 'gcc' failed with exit status 1\n\n", "output": "you need to install MySQL Development package\nyum install mysql-devel\n\n:D\n"}
{"input": "How to detect if numpy is installed I'm writing Python code. I want to check if numpy and wxpython are installed on machine. How to do that??\n", "output": "You can try importing them and then handle the ImportError if the module doesn't exist.\ntry:\n    import numpy\nexcept ImportError:\n    print \"numpy is not installed\"\n\n"}
{"input": "How do I use matplotlib autopct? I'd like to create a matplotlib pie chart which has the value of each wedge written on top of the wedge.\nThe documentation suggests I should use autopct to do this. \n\nautopct: [ None | format string |\n  format function ]\n      If not None, is a string or function used to label the wedges with\n  their numeric value. The label will be\n  placed inside the wedge. If it is a\n  format string, the label will be\n  fmt%pct. If it is a function, it will\n  be called.\n\nUnfortunately, I'm unsure what this format string or format function is supposed to be.\nUsing this basic example below, how can I display each numerical value on top of its wedge?\nplt.figure()\nvalues = [3, 12, 5, 8] \nlabels = ['a', 'b', 'c', 'd'] \nplt.pie(values, labels=labels) #autopct??\nplt.show()\n\n", "output": "autopct enables you to display the percent value using Python string formatting. For example, if autopct='%.2f', then for each pie wedge, the format string is '%.2f' and the numerical percent value for that wedge is pct, so the wedge label is set to the string '%.2f'%pct. \nimport matplotlib.pyplot as plt\nplt.figure()\nvalues = [3, 12, 5, 8] \nlabels = ['a', 'b', 'c', 'd'] \nplt.pie(values, labels=labels, autopct='%.2f')\nplt.show()\n\nyields\n\nYou can do fancier things by supplying a callable to autopct. To display both the percent value and the original value, you could do this:\nimport matplotlib.pyplot as plt\n\n# make the pie circular by setting the aspect ratio to 1\nplt.figure(figsize=plt.figaspect(1))\nvalues = [3, 12, 5, 8] \nlabels = ['a', 'b', 'c', 'd'] \n\ndef make_autopct(values):\n    def my_autopct(pct):\n        total = sum(values)\n        val = int(round(pct*total/100.0))\n        return '{p:.2f}%  ({v:d})'.format(p=pct,v=val)\n    return my_autopct\n\nplt.pie(values, labels=labels, autopct=make_autopct(values))\nplt.show()\n\n\nAgain, for each pie wedge, matplotlib supplies the percent value pct as the argument, though this time it is sent as the argument to the function my_autopct. The wedge label is set to my_autopct(pct).\n"}
{"input": "Accessing module level variables, from within a function in the module I'd like to be able to do something like this:\n#mymodule\nvar = None\n\ndef load():\n    var = something()\n\nOther module(s):\n#secondmodule\nimport mymodule\nmymodule.load()\n\n#thirdmodule\nfrom mymodule import var\nprint var #Shouldn't be None\n\nBut I don't know how to reference a modules variable from a function defined in the module.\nIs this possible? Or am I going to need to put a global declaration in every place I wan't to use this.\nOr am I going at this completely wrong?\n", "output": "Just change \ndef load():\n    global var\n    var = something()\n\nGlobal variables are read-only from sibling methods. More accurately unless a variable is specified as global, Python consider it as local, but a read access to a local variable name will reach module-level scope if the name is not present in local scope.\nSee also use of \u00e2\u0080\u009cglobal\u00e2\u0080\u009d keyword in python and the doc for more details about the global statement\n"}
{"input": "defaultdict(None) I wish to have a dictionary which contains a set of state transitions. I presumed that I could do this using states = defaultdict(None), but its not working as I expected. For example:\nstates = defaultdict(None)\nif new_state_1 != states[\"State 1\"]:\n    dispatch_transition()\n\nI would have thought that states[\"State 1\"] would return the value None and that if new_state is a bool that I would have gotten False for new_state != states[\"State 1\"], but instead I get a KeyError.\nWhat am i doing wrong?\nThanks,\nBarry\n", "output": "defaultdict requires a callable as argument that provides the default-value when invoked without arguments. None is not callable. What you want is this:\ndefaultdict(lambda: None)\n\n"}
{"input": "Determining how many times a substring occurs in a string in Python I am trying to figure out how many times a string occurs in a string. For example:\nnStr = '000123000123'\n\nSay the string I want to find is 123. Obviously it occurs twice in nStr but I am having trouble implementing this logic into Python. What I have got at the moment:\npattern = '123'\ncount = a = 0\nwhile pattern in nStr[a:]:\n    a = nStr[a:].find(pattern)+1\n    count += 1\nreturn count\n\nThe answer it should return is 2. I'm stuck in an infinite loop at the moment.\nI was just made aware that count is a much better way to do it but out of curiosity, does anyone see a way to do it similar to what I have already got?\n", "output": "Use str.count:\n>>> nStr = '000123000123'\n>>> nStr.count('123')\n2\n\nA working version of your code:\nnStr = '000123000123'\npattern = '123'\ncount =0\nflag=True\nstart=0\nwhile flag:\n    a = nStr.find(pattern,start)  # find() returns -1 if the word is not found, \n                                  #start i the starting index from the search starts(default value is 0)\n    if a==-1:          #if pattern not found set flag to False\n        flag=False\n    else:               # if word is found increase count and set starting index to a+1\n        count+=1        \n        start=a+1\nprint(count)\n\n"}
{"input": "GAE SDK 1.7.4 and InvalidCertificateException Recently, I upgraded my GAE SDK to ver. 1.7.4 and it started to throw 'InvalidCertificateException' when I try to run development server. I searched about this error and some people said it goes away with time, but mine didn't. What should I look into to fix this problem? I am using python framework Django for my app if that has to matter somehow.\n$ dev_appserver.py ./\nINFO     2012-12-16 07:44:31,412 appcfg.py:586] Checking for updates to the SDK.\nTraceback (most recent call last):\n  File \"/usr/local/bin/dev_appserver.py\", line 171, in <module>\n    run_file(__file__, globals())\n  File \"/usr/local/bin/dev_appserver.py\", line 167, in run_file\n    execfile(script_path, globals_)\n  File \"/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/tools/dev_appserver_main.py\", line 747, in <module>\n    sys.exit(main(sys.argv))\n  File \"/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/tools/dev_appserver_main.py\", line 680, in main\n    update_check.CheckForUpdates()\n  File \"/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/tools/appcfg.py\", line 597, in CheckForUpdates\n    runtime=self.config.runtime)\n  File \"/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/tools/appengine_rpc.py\", line 391, in Send\n    f = self.opener.open(req)\n  File \"/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py\", line 394, in open\n    response = self._open(req, data)\n  File \"/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py\", line 412, in _open\n    '_open', req)\n  File \"/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py\", line 372, in _call_chain\n    result = func(*args)\n  File \"/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py\", line 1207, in https_open\n    return self.do_open(httplib.HTTPSConnection, req)\n  File \"/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/lib/fancy_urllib/fancy_urllib/__init__.py\", line 379, in do_open\n    url_error.reason.args[1])\nfancy_urllib.InvalidCertificateException: Host appengine.google.com returned an invalid certificate (_ssl.c:503: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed): \nTo learn more, see http://code.google.com/appengine/kb/general.html#rpcssl\n\n", "output": "Quick workaround that I found: delete the file google_appengine/lib/cacerts/cacerts.txt from your installed SDK.\nStarting from the GoogleAppEngineLauncher:\n\nGoogleAppEngineLauncher/Contents/Resources/GoogleAppEngineDefault.bundle/Content\u00e2\u0080\u008c\u00e2\u0080\u008bs/Resources/google_appengine/lib/cacerts/cacerts.txt\n\nEDIT #\n\nas of google app engine SDK 1.8.1 this file as been renamed to\n  urlfetch_cacerts.txt. Still in the same directory and removing it\n  still fixes the problem.\n\n\u00e2\u0080\u0093 @Harrison\n"}
{"input": "How to flatten only some dimensions of a numpy array Is there a quick way to \"sub-flatten\" or flatten only some of the first dimensions in a numpy array?\nFor example, given a numpy array of dimensions (50,100,25), the resultant dimensions would be (5000,25)\nThanks\n", "output": "A slight generalization to Alexander's answer - np.reshape can take -1 as an argument, meaning \"total array size divided by product of all other listed dimensions\":\ne.g. to flatten all but the last dimension:\n>>> arr = numpy.zeros((50,100,25))\n>>> new_arr = arr.reshape(-1, arr.shape[-1])\n>>> new_arr.shape\n# (5000, 25)\n\n"}
{"input": "What is the difference between `sorted(list)` vs `list.sort()` ? python list.sort() sorts the list and save the sorted list, while sorted(list) returns a sorted list without changing the original list.\n\nBut when to use which?  \nAnd which is faster? And how much faster?\nCan a list's original positions be retrieved after list.sort()?\n\n", "output": "sorted() returns a new sorted list, leaving the original list unaffected. list.sort() sorts the list in-place, mutating the list indices, and returns None (like all in-place operations).\nsorted() works on any iterable, not just lists. Strings, tuples, dictionaries (you'll get the keys), generators, etc., returning a list containing all elements, sorted.\n\nUse list.sort() when you want to mutate the list, sorted() when you want a new sorted object back. Use sorted() when you want to sort something that is an iterable, not a list yet.\nFor lists, list.sort() is faster than sorted() because it doesn't have to create a copy. For any other iterable, you have no choice.\nNo, you cannot retrieve the original positions. Once you called list.sort() the original order is gone.\n\n"}
{"input": "Quicksort sorts larger numbers faster? I was messing around with Python trying to practice my sorting algorithms and found out something interesting.\nI have three different pieces of data:  \n\nx = number of numbers to sort  \ny = range the numbers are in (all random generated ints)  \nz = total time taken to sort  \n\nWhen:\nx = 100000 and\ny = (0,100000) then\nz =  0.94182094911 sec\nWhen:\nx = 100000 and\ny = (0,100) then\nz = 12.4218382537 sec\nWhen:\nx = 100000 and\ny = (0,10) then\nz =  110.267447809 sec\nAny ideas?\nCode:\nimport time\nimport random\nimport sys\n\n#-----Function definitions\n\ndef quickSort(array): #random pivot location quicksort. uses extra memory.\n    smaller = []\n    greater = []\n    if len(array) <= 1:\n        return array\n    pivotVal = array[random.randint(0, len(array)-1)]\n    array.remove(pivotVal)\n    for items in array:\n        if items <= pivotVal:\n            smaller.append(items)\n        else:\n            greater.append(items)\n    return concat(quickSort(smaller), pivotVal, quickSort(greater))\n\ndef concat(before, pivot, after):\n    new = []\n    for items in before:\n        new.append(items)\n    new.append(pivot)\n    for things in after:\n        new.append(things)\n    return new\n\n#-----Variable definitions\nlist = []\niter = 0\nsys.setrecursionlimit(20000)\nstart = time.clock() #start the clock\n\n#-----Generate the list of numbers to sort\nwhile(iter < 100000):\n    list.append(random.randint(0,10))  #modify this to change sorting speed\n    iter = iter + 1\ntimetogenerate = time.clock() - start #current timer - last timer snapshot\n\n#-----Sort the list of numbers\nlist = quickSort(list)\ntimetosort = time.clock() - timetogenerate #current timer - last timer snapshot\n\n#-----Write the list of numbers\nfile = open(\"C:\\output.txt\", 'w')\nfor items in list:\n    file.write(str(items))\n    file.write(\"\\n\")\nfile.close()\ntimetowrite = time.clock() - timetosort #current timer - last timer snapshot\n\n#-----Print info\nprint \"time to start: \" + str(start)\nprint \"time to generate: \" + str(timetogenerate)\nprint \"time to sort: \" + str(timetosort)\nprint \"time to write: \" + str(timetowrite)\ntotaltime = timetogenerate + timetosort + start\nprint \"total time: \" + str(totaltime)\n\n-------------------revised NEW code----------------------------  \ndef quickSort(array): #random pivot location quicksort. uses extra memory.\n    smaller = []\n    greater = []\n    equal = []\n    if len(array) <= 1:\n        return array\n    pivotVal = array[random.randint(0, len(array)-1)]\n    array.remove(pivotVal)\n    equal.append(pivotVal)\n    for items in array:\n        if items < pivotVal:\n            smaller.append(items)\n        elif items > pivotVal:\n            greater.append(items)\n        else:\n            equal.append(items)\n    return concat(quickSort(smaller), equal, quickSort(greater))\n\ndef concat(before, equal, after):\n    new = []\n    for items in before:\n        new.append(items)\n    for items in equal:\n        new.append(items)\n    for items in after:\n        new.append(items)\n    return new\n\n", "output": "I think this has to do with the choice of a pivot.  Depending on how your partition step works, if you have a lot of duplicate values, your algorithm can degenerate to quadratic behavior when confronted with many duplicates.  For example, suppose that you're trying to quicksort this stream:\n [0 0 0 0 0 0 0 0 0 0 0 0 0]\n\nIf you aren't careful with how you do the partitioning step, this can degenerate quickly.  For example, suppose you pick your pivot as the first 0, leaving you with the array\n [0 0 0 0 0 0 0 0 0 0 0 0]\n\nto partition.  Your algorithm might say that the smaller values are the array\n [0 0 0 0 0 0 0 0 0 0 0 0]\n\nAnd the larger values are the array\n []\n\nThis is the case that causes quicksort to degenerate to O(n2), since each recursive call is only shrinking the size of the input by one (namely, by pulling off the pivot element).\nI noticed that in your code, your partitioning step does indeed do this:\nfor items in array:\n    if items <= pivotVal:\n        smaller.append(items)\n    else:\n        greater.append(items)\n\nGiven a stream that's a whole bunch of copies of the same element, this will put all of them into one array to recursively sort.\nOf course, this seems like a ridiculous case - how is this at all connected to reducing the number of values in the array? - but it actually does come up when you're sorting lots of elements that aren't distinct.  In particular, after a few passes of the partitioning, you're likely to group together all equal elements, which will bring you into this case.\nFor a discussion of how to prevent this from happening, there's a really great talk by Bob Sedgewick and Jon Bentley about how to modify the partition step to work quickly when in the presence of duplicate elements.  It's connected to Dijkstra's Dutch national flag problem, and their solutions are really clever.\nOne option that works is to partition the input into three groups - less, equal, and greater.  Once you've broken the input up this way, you only need to sort the less and greater groups; the equal groups are already sorted.  The above link to the talk shows how to do this more or less in-place, but since you're already using an out-of-place quicksort the fix should be easy.  Here's my attempt at it:\nfor items in array:\n    if items < pivotVal:\n        smaller.append(items)\n    elif items == pivotVal:\n        equal.append(items)\n    else:\n        greater.append(items)\n\nI've never written a line of Python in my life, BTW, so this may be totally illegal syntax.  But I hope the idea is clear!  :-)\n"}
{"input": "Python IndentationError: unexpected indent I really can't see the indentation error here, i'm getting crazy with this ->\n# loop\nwhile d <= end_date:\n    # print d.strftime(\"%Y%m%d\")\n    fecha = d.strftime(\"%Y%m%d\")\n    # set url\n    url = 'http://www.wpemergencia.omie.es//datosPub/marginalpdbc/marginalpdbc_' + fecha + '.1'\n    # Descargamos fichero\n    response = urllib2.urlopen(url)\n    # Abrimos fichero\n    output = open(fname,'wb')\n    # Escribimos fichero\n    output.write(response.read())\n    # Cerramos y guardamos fichero\n    output.close()\n    # fecha++\n    d += delta\n\n", "output": "Run your program with\npython -t script.py\n\nThis will warn you if you have mixed tabs and spaces.\nOn *nix systems, you can see where the tabs are by running\ncat -A script.py\n\nand you can automatically convert tabs to 4 spaces with the command\nexpand -t 4 script.py > fixed_script.py\n\nPS. Be sure to use a programming editor (e.g. emacs, vim), not a word processor, when programming. You won't get this problem with a programming editor.\nPPS. For emacs users, M-x whitespace-mode will show the same info as cat -A from within an emacs buffer!\n"}
{"input": "Passing STATIC_URL to file javascript with django Which is the best solution to pass {{ STATIC_URL }} to javascript files?\nI'm working with django and python.\nThanks in advance.\nRegards.\n", "output": "Using a global javascript variable with the static url value is more simple :\n<script language=\"javascript\">var STATIC_URL = \"{{ STATIC_URL|escapejs }}\";</script>\n<script src=\"{{ STATIC_URL }}js/myfile.js\"></script>\n\nThen, you can simply use the static url by calling STATIC_URL in myfile.js :\nhtml = '<img src=\"'+STATIC_URL+'/icons/flags/tn.gif\">';\n\n"}
{"input": "django-debug-toolbar breaking on admin while getting sql stats Environment:django debug toolbar breaking while using to get sql stats else it's working fine on the other pages, breaking only on the pages which have sql queries.\nRequest Method: GET\nRequest URL: http://www.blog.local/admin/\n\nDjango Version: 1.9.7\nPython Version: 2.7.6\nInstalled Applications:\n[\n ....\n 'django.contrib.staticfiles',\n 'debug_toolbar']\nInstalled Middleware:\n[\n  ...\n 'debug_toolbar.middleware.DebugToolbarMiddleware']\n\nTraceback:\n\nFile \"/home/vagrant/www/dx/venv/local/lib/python2.7/site-packages/django/core/handlers/base.py\" in get_response\n  235.                 response = middleware_method(request, response)\n\nFile \"/home/vagrant/www/dx/venv/local/lib/python2.7/site-packages/debug_toolbar/middleware.py\" in process_response\n  129.                 panel.generate_stats(request, response)\n\nFile \"/home/vagrant/www/dx/venv/local/lib/python2.7/site-packages/debug_toolbar/panels/sql/panel.py\" in generate_stats\n  192.                     query['sql'] = reformat_sql(query['sql'])\n\nFile \"/home/vagrant/www/dx/venv/local/lib/python2.7/site-packages/debug_toolbar/panels/sql/utils.py\" in reformat_sql\n  27.     return swap_fields(''.join(stack.run(sql)))\n\nFile \"/home/vagrant/www/dx/venv/local/lib/python2.7/site-packages/sqlparse/engine/filter_stack.py\" in run\n  29.             stream = filter_.process(stream)\n\nException Type: TypeError at /admin/\n Exception Value: process() takes exactly 3 arguments (2 given)\n\n", "output": "sqlparse latest version was released today and it's not compatible with django-debug-toolbar version 1.4, Django version 1.9\nworkaround is force pip to install sqlparse==0.1.19\n"}
{"input": "Algorithm for neatly indenting SQL statements (Python implementation would be nice) I'd like to reformat some SQL statements that are a single string with newlines in to something that's much easier to read.\nI don't personally know of a good coding style for indenting SQL - how should nested queries / where clauses / left joins / etc by represented to maximise readability?\nHas anyone seen a pretty-printing algorithm that does this already? In Python would be even better.\n", "output": "You can try sqlparse. It's a Python module that provides simple SQL formatting. A online demo is available here.\n"}
{"input": "Python way to clone a git repository Is there a Python way without using a subprocess to clone a git repository? I'm up for using any sort of modules you recommend. \n", "output": "Using GitPython will give you a good python interface to Git.\nFor example, after installing it (pip install gitpython), for cloning a new repository you can use clone_from function:\nfrom git import Repo\n\nRepo.clone_from(git_url, repo_dir)\n\nNote: GitPython requires git being installed on the system, and accessible via system's PATH.\n"}
{"input": "What is a good way to do countif in Python I want to count how many members of an iterable meet a given condition. I'd like to do it in a way that is clear and simple and preferably reasonably optimal.\nMy current best ideas are:\nsum(meets_condition(x) for x in my_list)\n\nand\nlen([x for x in my_list if meets_condition(x)])\n\nThe first one being iterator based is presumably faster for big lists. And it's the same form as you'd use for testing any and all. However it depends on the fact that int(True) == 1, which is somewhat ugly.\nThe second one seems easier to read to me, but it is different from the any and all forms.\nDoes anyone have any better suggestions? is there a library function somewhere that I am missing?\n", "output": "The iterator based approach is just fine. There are some slight modifications that can emphasize the fact that you are counting:\nsum(1 if meets_condition(x) else 0 for x in my_list)\n# or \nsum(1 for x in my_list if meets_condition(x))\n\nAnd as always, if the intent isn't apparent from the code, encapsulate it in descriptively named function:\ndef count_matching(condition, seq):\n    \"\"\"Returns the amount of items in seq that return true from condition\"\"\"\n    return sum(1 for item in seq if condition(item))\n\ncount_matching(meets_condition, my_list)\n\n"}
{"input": "How to keep all my django applications in specific folder I have a Django project, let's say \"project1\". \nTypical folder structure for applications is:\n/project1/\n         /app1/\n         /app2/\n         ...\n         __init__.py\n         manage.py\n         settings.py\n         urls.py\n\nWhat should I do if I want to hold all of my applications in some separate folder, 'apps' for example? So that structure should look like the following:\n/project/\n         apps/\n              app1/\n              app2/\n              ...\n         __init__.py\n         manage.py\n         settings.py\n         urls.py\n\n", "output": "You can add your apps folder to your python path by inserting the following in your settings.py:\nimport os\nimport sys\n\nPROJECT_ROOT = os.path.dirname(__file__)\nsys.path.insert(0, os.path.join(PROJECT_ROOT, 'apps'))\n\nThen you can use all the apps in this folder just in the same way as they were in your project root!\n"}
{"input": "Import from sibling directory I have a Python class called \"ClassA\" and another Python class which is supposed to import ClassA which is \"ClassB\". The directory structure is as follows:\nMainDir\n../Dir\n..../DirA/ClassA\n..../DirB/ClassB\n\nHow would I use sys.path so that ClassB can use ClassA?\n", "output": "as a literal answer to the question 'Python Import from parent directory':\nto import 'mymodule' that is in the parent directory of your current module:\nimport os\nparentdir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))\nos.sys.path.insert(0,parentdir) \nimport mymodule\n\nedit\nUnfortunately, the __file__ attribute is not always set.\nA more secure way to get the parentdir is through the inspect module:\nimport inspect\ncurrentdir = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe())))\nparentdir = os.path.dirname(currentdir)\n\n"}
{"input": "How can I read the contents of an URL with Python? The following works when I paste it on the browser: \nhttp://www.somesite.com/details.pl?urn=2344\n\nBut when I try reading the URL with Python nothing happens:\n link = 'http://www.somesite.com/details.pl?urn=2344'\n f = urllib.urlopen(link)           \n myfile = f.readline()  \n print myfile\n\nDo I need to encode the URL, or is there something I'm not seeing?\n", "output": "To answer your question:\nimport urllib\n\nlink = \"http://www.somesite.com/details.pl?urn=2344\"\nf = urllib.urlopen(link)\nmyfile = f.read()\nprint myfile\n\nYou need to read(), not readline()\nOr, just get this library here: http://docs.python-requests.org/en/latest/ and seriously use it :)\nimport requests\n\nlink = \"http://www.somesite.com/details.pl?urn=2344\"\nf = requests.get(link)\n\nprint f.text\n\n"}
{"input": "Pandas Combining 2 Data Frames (join on a common column) I have 2 dataframes:\nrestaurant_ids_dataframe \nData columns (total 13 columns):\nbusiness_id      4503  non-null values\ncategories       4503  non-null values\ncity             4503  non-null values\nfull_address     4503  non-null values\nlatitude         4503  non-null values\nlongitude        4503  non-null values\nname             4503  non-null values\nneighborhoods    4503  non-null values\nopen             4503  non-null values\nreview_count     4503  non-null values\nstars            4503  non-null values\nstate            4503  non-null values\ntype             4503  non-null values\ndtypes: bool(1), float64(3), int64(1), object(8)`\n\nand \nrestaurant_review_frame\nInt64Index: 158430 entries, 0 to 229905\nData columns (total 8 columns):\nbusiness_id    158430  non-null values\ndate           158430  non-null values\nreview_id      158430  non-null values\nstars          158430  non-null values\ntext           158430  non-null values\ntype           158430  non-null values\nuser_id        158430  non-null values\nvotes          158430  non-null values\ndtypes: int64(1), object(7)\n\nI would like to join these two DataFrames to make them into a single dataframe using the DataFrame.join() command in pandas.\nI have tried the following line of code:\n#the following line of code creates a left join of restaurant_ids_frame and   restaurant_review_frame on the column 'business_id'\nrestaurant_review_frame.join(other=restaurant_ids_dataframe,on='business_id',how='left')\n\nBut when I try this I get the following error:\nException: columns overlap: Index([business_id, stars, type], dtype=object)\n\nI am very new to pandas and have no clue what I am doing wrong as far as executing the join statement is concerned.\nany help would be much appreciated.\n", "output": "You can use merge to combine two dataframes into one:\nimport pandas as pd\npd.merge(restaurant_ids_dataframe, restaurant_review_frame, on='business_id', how='outer')\n\nwhere on specifies field name that exists in both dataframes to join on, and how\ndefines whether its inner/outer/left/right join, with outer using 'union of keys from both frames (SQL: full outer join).' Since you have 'star' column in both dataframes, this by default will create two columns star_x and star_y in the combined dataframe. As @DanAllan mentioned for the join method, you can modify the suffixes for merge by passing it as a kwarg. Default is suffixes=('_x', '_y'). if you wanted to do something like star_restaurant_id and star_restaurant_review, you can do:\n pd.merge(restaurant_ids_dataframe, restaurant_review_frame, on='business_id', how='outer', suffixes=('_restaurant_id', '_restaurant_review'))\n\nThe parameters are explained in detail in this link. \n"}
{"input": "Is it possible to generate and return a ZIP file with App Engine? I have a small project that would be perfect for Google App Engine.  Implementing it hinges on the ability to generate a ZIP file and return it.\nDue to the distributed nature of App Engine, from what I can tell, the ZIP file couldn't be created \"in-memory\" in the traditional sense.  It would basically have to be generated and and sent in a single request/response cycle.\nDoes the Python zip module even exist in the App Engine environment?\n", "output": "zipfile is available at appengine and reworked example follows:\nfrom contextlib import closing\nfrom zipfile import ZipFile, ZIP_DEFLATED\n\nfrom google.appengine.ext import webapp\nfrom google.appengine.api import urlfetch\n\ndef addResource(zfile, url, fname):\n    # get the contents      \n    contents = urlfetch.fetch(url).content\n    # write the contents to the zip file\n    zfile.writestr(fname, contents)\n\nclass OutZipfile(webapp.RequestHandler):\n    def get(self):\n        # Set up headers for browser to correctly recognize ZIP file\n        self.response.headers['Content-Type'] ='application/zip'\n        self.response.headers['Content-Disposition'] = \\\n            'attachment; filename=\"outfile.zip\"'    \n\n        # compress files and emit them directly to HTTP response stream\n        with closing(ZipFile(self.response.out, \"w\", ZIP_DEFLATED)) as outfile:\n            # repeat this for every URL that should be added to the zipfile\n            addResource(outfile, \n                'https://www.google.com/intl/en/policies/privacy/', \n                'privacy.html')\n            addResource(outfile, \n                'https://www.google.com/intl/en/policies/terms/', \n                'terms.html')\n\n"}
{"input": "How to create a UserProfile form in Django with first_name, last_name modifications? If think my question is pretty obvious and almost every developer working with UserProfile should be able to answer it.\nHowever, I could not find any help on the django documentation or in the Django Book.\nWhen you want to do a UserProfile form in with Django Forms, you'd like to modify the profile fields as well as some User field.\nBut there is no forms.UserProfileForm (yet?) !\nHow do you do that ?\n", "output": "Here is how I finally did : \nclass UserProfileForm(forms.ModelForm):\n    first_name = forms.CharField(label=_(u'Pr\u00c3\u00a9nom'), max_length=30)\n    last_name = forms.CharField(label=_(u'Nom'), max_length=30)\n\n    def __init__(self, *args, **kw):\n        super(UserProfileForm, self).__init__(*args, **kw)\n        self.fields['first_name'].initial = self.instance.user.first_name\n        self.fields['last_name'].initial = self.instance.user.last_name\n\n        self.fields.keyOrder = [\n            'first_name',\n            'last_name',\n            ...some_other...\n            ]\n\n    def save(self, *args, **kw):\n        super(UserProfileForm, self).save(*args, **kw)\n        self.instance.user.first_name = self.cleaned_data.get('first_name')\n        self.instance.user.last_name = self.cleaned_data.get('last_name')\n        self.instance.user.save()\n\n    class Meta:\n        model = UserProfile\n\n"}
{"input": "attribute 'tzinfo' of 'datetime.datetime' objects is not writable How do I set the timezone of a datetime instance that just came out of the datastore?\nWhen it first comes out it is in UTC.  I want to change it to EST.\nI'm trying, for example:\nclass Book( db.Model ):\n    creationTime = db.DateTimeProperty()\n\nWhen a Book is retrieved, I want to set its tzinfo immediately:\nbook.creationTime.tzinfo = EST\n\nWhere I use this example for my EST object\nHowever I get:\n\nattribute 'tzinfo' of 'datetime.datetime' objects is not writable\n\nI've seen a number of answers that recommend pytz and python-dateutil, but I really want an answer to this question.\n", "output": "datetime's objects are immutable, so you never change any of their attributes -- you make a new object with some attributes the same, and some different, and assign it to whatever you need to assign it to.\nI.e., in your case, instead of\nbook.creationTime.tzinfo = EST\n\nyou have to code\nbook.creationTime = book.creationTime.replace(tzinfo=EST)\n\n"}
{"input": "The \"next\" parameter, redirect, django.contrib.auth.login I'm trying to redirect users to custom url \"/gallery/(username)/\" after successfully logging in.  It currently redirects to the default \"/account/profile/\"  While I know what I can override the redirect url in my settings.py, my url is dynamic thus it will not work.\nDocumentation states that I need to use the \"next\" parameter and context processors.  I have the {{next}} in my template, but I'm confused on how to actually pass the \"/gallery/(username)\".  Any help would be greatly appreciated.\np.s I'm trying to steer away from writing my own login view.\n", "output": "Django's login view django.contrib.auth.views.login accepts a dictionary named extra_context. The values in the dictionary are directly passed to the template. So you can use that to set the next parameter. Once that is done, you can set a hidden field with name next and value {{ next }} so that it gets rendered in the template.\n"}
{"input": "python count days ignoring weekends how can I calculate number of days between two dates ignoring weekends ?\n", "output": ">>> from datetime import date,timedelta\n>>> fromdate = date(2010,1,1)\n>>> todate = date(2010,3,31)\n>>> daygenerator = (fromdate + timedelta(x + 1) for x in xrange((todate - fromdate).days))\n>>> sum(1 for day in daygenerator if day.weekday() < 5)\n63\n\nThis creates a generator using a generator expression which will yield the list of days to get from the fromdate to todate.  \nWe could then create a list from the generator, filtering out weekends using the weekday() function, and the size of the list gives the number of days we want.  However, to save having the whole list in memory which could be a problem if the dates are a long time apart we use another generator expression which filters out weekends but returns 1 instead of each date.  We can then just add all these 1s together to get the length without having to store the whole list.\nNote, if fromdate == todate this calculate 0 not 1.\n"}
{"input": "Using a regular expression to replace upper case repeated letters in python with a single lowercase letter I am trying to replace any instances of uppercase letters that repeat themselves twice in a string with a single instance of that letter in a lower case. I am using the following regular expression and it is able to match the repeated upper case letters, but I am unsure as how to make the letter that is being replaced lower case.\nimport re\ns = 'start TT end'\nre.sub(r'([A-Z]){2}', r\"\\1\", s)\n>>> 'start T end'\n\nHow can I make the \"\\1\" lower case? Should I not be using a regular expression to do this?\n", "output": "Pass a function as the repl argument. The MatchObject is passed to this function and .group(1) gives the first parenthesized subgroup:\nimport re\ns = 'start TT end'\ncallback = lambda pat: pat.group(1).lower()\nre.sub(r'([A-Z]){2}', callback, s)\n\nEDIT\nAnd yes, you should use ([A-Z])\\1 instead of ([A-Z]){2} in order to not match e.g. AZ. (See @bobince's answer.)\nimport re\ns = 'start TT end'\nre.sub(r'([A-Z])\\1', lambda pat: pat.group(1).lower(), s) # Inline\n\nGives:\n'start t end'\n\n"}
{"input": "How to create a python 2.x package - simple case Please show the simple and up to date standard way to create a python package for python 2.x\nI'd prefer to use pip for installing the package later.\nThe package should contain a single class:\nclass hello:\n  def greet(self):\n    print \"hello\"\n\nOne should be able to do the following later:\npip install my_package-0.1.1....\n\nAnd then using it:\nfrom my_package import hello\n\nh = hello.hello()\nh.greet()\n\nWhat I am asking for is:\n\nThe directory and file layout\nContents of the files \ncommand to create the distributable package file\ncommand to install the package from the distributable package file (using preferably pip)\n\nThere are several howtos that I found but I am still not sure how this very simple and stripped down case (no nested packages, removal off all files and features that can be omitted for the most simple case) would be handled and which is the modern way to do it.\nI would like this question to enter community wiki state, so you won't get any points and I will give enough time and will mark an answer accepted after several days, also considering the votes and comments.\nEdit:\nI have a first running example that I want to share, I used Marius Gedminas's answer for it. It does not contain everything that should be there, but it works, so it can demonstrate the core of the technical process. To add more necessary parts please read Marius's answer below. \nDirectory structure:\nMyProject/\n    setup.py\n    my_package.py\n    README.txt\n    MANIFEST.in\n\nsetup.py:\nfrom setuptools.import setup\nsetup(name='MyProject',\n      version='0.1',\n      py_modules=['my_package'])\n\nmy_package.py:\nclass hello:\n  def greet(self):\n    print \"hello\"\n\nMANIFEST.in:\ninclude *.txt\n\nTo create the package from this folder, go into the folder MyProject and run:\n$ python setup.py sdist\n\nThis will create a file MyProject-0.1.tar.gz in a subfolder dist/. Copy this file to a folder on the target machine.\nOn the target machine run this command in the folder containing MyProject-0.1.tar.gz:\nsudo pip install MyProject-0.1.tar.gz\n\nIt can be necessary to logout and re-login on the target machine now, so the package will be found. Afterwards you can test the package on the target machine using the python shell:\n$ python\n>>> import my_package\n>>> h = my_package.hello()\n>>> h.greet()\nhello\n>>> \n\nOnce this works please remember to add the other necessary contents, see Marius's answer below.\n", "output": "Start simple\nSimplest one-file package:\nMyProject/\n    setup.py\n    my_package.py\n\nSimplest setup.py:\nfrom setuptools import setup\nsetup(name='MyProject',\n      version='0.1',\n      author='Your Name',\n      author_email='your.name@example.com',\n      license='MIT',\n      description='Example package that says hello',\n      py_modules=['my_package'])\n\nIncluding extra files in the package\nNext you should probably add a README:\nMyProject/\n    MANIFEST.in\n    README.rst\n    setup.py\n    my_package.py\n\nNote the new file -- MANIFEST.in.  It specifies which non-Python files ought to be included in your source distribution:\ninclude *.rst\n\nPeople will tell you \"oh, skip the manifest, just add the files to source control, setuptools will find them\".  Ignore that advice, it's too error-prone.\nMaking the PyPI page useful\nIt's useful to make the README.rst available for people to view online, on the Python Package Index.  So change your setup.py to do\nfrom setuptools import setup\nwith open('README.rst') as f:\n    readme = f.read()\nsetup(name='MyProject',\n      ...\n      description='Example package that says hello',\n      long_description=readme,\n      ...)\n\nUse ReStructuredText markup for prettier pages.  Use\npython setup.py --long-description | rst2html\n\nto catch ReStructuredText errors early.\nMore than one Python module in a package\nOne file will not be enough soon, so change it to a package (confusing terminology warning: Python package as in a directory with a __init__ py, not as in a distributable self-contained archive):\nMyProject/\n    MANIFEST.in\n    README.rst\n    setup.py\n    my_package/\n        __init__.py\n        some_module.py\n\nand change setup.py to\nfrom setuptools import setup, find_packages\nwith open('README.rst') as f:\n    readme = f.read()\nsetup(name='MyProject',\n      version='0.2',\n      author='Your Name',\n      author_email='your@email',\n      license='MIT',\n      description='Example package that says hello',\n      long_description=readme,\n      packages=find_packages())\n\nReleasing to the public\nGet a PyPI account -- you only need to do this once.\nTo make a release, make sure the version number in setup.py is correct, then run\npython setup.py sdist register upload\n\nThat's it.\nTelling people to install it\nTell them to\npip install MyProject\n\n(same name you specified in setup.py as the name argument to setup())\n"}
{"input": "How do you use subprocess.check_output() in Python? I have found documentation about subprocess.check_output() but I cannot find one with  arguments and the documentation is not very in depth. I am using Python 3 (but am trying to run a Python 2 file through Python 3)\nI am trying to run this command:\npython py2.py -i test.txt\n-i is a positional argument for argparse, test.txt is what the -i is, py2.py is the file to run\nI have tried a lot of (non working) variations including:\npy2output = subprocess.check_output([str('python py2.py '),'-i', 'test.txt'])\npy2output = subprocess.check_output([str('python'),'py2.py','-i', test.txt'])\n", "output": "The right answer is:\npy2output = subprocess.check_output(['python','py2.py','-i', 'test.txt'])\n\nTo demonstrate, here are my two programs:\npy2.py:\nimport sys\nprint sys.argv\n\npy3.py:\nimport subprocess\npy2output = subprocess.check_output(['python', 'py2.py', '-i', 'test.txt'])\nprint('py2 said:', py2output)\n\nRunning it:\n$ python3 py3.py\npy2 said: b\"['py2.py', '-i', 'test.txt']\\n\"\n\nHere's what's wrong with each of your versions:\npy2output = subprocess.check_output([str('python py2.py '),'-i', 'test.txt'])\n\nFirst, str('python py2.py') is exactly the same thing as 'python py2.py'\u00e2\u0080\u0094you're taking a str, and calling str to convert it to an str. This makes the code harder to read, longer, and even slower, without adding any benefit.\nMore seriously, python py2.py can't be a single argument, unless you're actually trying to run a program named, say, /usr/bin/python\\ py2.py. Which you're not; you're trying to run, say, /usr/bin/python with first argument py2.py. So, you need to make them separate elements in the list.\nYour second version fixes that, but you're missing the ' before test.txt'. This should give you a SyntaxError, probably saying EOL while scanning string literal.\nMeanwhile, I'm not sure how you found documentation but couldn't find any examples with arguments. The very first example is:\n>>> subprocess.check_output([\"echo\", \"Hello World!\"])\nb'Hello World!\\n'\n\nThat calls the \"echo\" command with an additional argument, \"Hello World!\".\nAlso:\n\n-i is a positional argument for argparse, test.txt is what the -i is\n\nI'm pretty sure -i is not a positional argument, but an optional argument. Otherwise, the second half of the sentence makes no sense.\n"}
{"input": "How to write to an existing excel file without overwriting data (using pandas)? I use pandas to write to excel file in the following fashion:\nimport pandas\n\nwriter = pandas.ExcelWriter('Masterfile.xlsx') \n\ndata_filtered.to_excel(writer, \"Main\", cols=['Diff1', 'Diff2'])\n\nwriter.save()\n\nMasterfile.xlsx already consists of number of different tabs.\nPandas correctly writes to \"Main\" sheet, unfortunately it also deletes all other tabs.\n", "output": "Pandas docs says it uses openpyxl for xlsx files. Quick look through the code in ExcelWriter gives a clue that something like this might work out:\nimport pandas\nfrom openpyxl import load_workbook\n\nbook = load_workbook('Masterfile.xlsx')\nwriter = pandas.ExcelWriter('Masterfile.xlsx', engine='openpyxl') \nwriter.book = book\nwriter.sheets = dict((ws.title, ws) for ws in book.worksheets)\n\ndata_filtered.to_excel(writer, \"Main\", cols=['Diff1', 'Diff2'])\n\nwriter.save()\n\n"}
{"input": "lxml runtime error: Reason: Incompatible library version: etree.so requires version 12.0.0 or later, but libxml2.2.dylib provides version 10.0.0 I have a perplexing problem.\nI have used mac version 10.9, anaconda 3.4.1, python 2.7.6.\nDeveloping web application with python-amazon-product-api. i have overcome an obstacle about installing lxml, referencing clang error: unknown argument: '-mno-fused-madd' (python package installation failure).\nbut another runtime error happened.\nHere is the output from webbrowser.\nException Type: ImportError\nException Value:    \ndlopen(/Users/User_Name/Documents/App_Name/lib/python2.7/site-packages/lxml/etree.so, 2): Library not loaded: libxml2.2.dylib\nReferenced from: /Users/User_Name/Documents/App_Name/lib/python2.7/site-packages/lxml/etree.so\nReason: Incompatible library version: etree.so requires version 12.0.0 or later, but libxml2.2.dylib provides version 10.0.0\n\nNot sure how to proceed and have searched here and elsewhere for this particular error. Any help is much appreciated!\n", "output": "This worked for me:\nbrew install libxml2\nbrew install libxslt\nbrew link libxml2 --force\nbrew link libxslt --force\n\n"}
{"input": "Are these two python statements the same? I have these two statements\nreturn self.getData() if self.getData() else ''\n\nand\nreturn self.getData() or ''\n\nI want to know are they same or there is any difference\n", "output": "I would say No because if self.getData() changes something during its operation, then the first statement has the possibility of returning a different result since it will make a 2nd call to it.\n"}
{"input": "How to convert XML to objects? I need to load an XML file and put the contents into an object-oriented structure. I want to take this:\n<main>\n    <object1 attr=\"name\">content</object>\n</main>\n\nand turn it into something like this:\nmain\nmain.object1 = \"content\"\nmain.object1.attr = \"name\"\n\nIt'll have a more complicated structure than that and I can't hard code the element names. The names need to be collected when parsing and used as the object properties.\nAny thoughts on what would be the best way to go about this?\n", "output": "It's worth to have a look at http://lxml.de/objectify.html\n>>> xml = \"\"\"<main>\n... <object1 attr=\"name\">content</object1>\n... <object1 attr=\"foo\">contenbar</object1>\n... <test>me</test>\n... </main>\"\"\"\n\n>>> from lxml import objectify\n\n>>> main = objectify.fromstring(xml)\n\n>>> main.object1[0]\n'content'\n\n>>> main.object1[1]\n'contenbar'\n\n>>> main.object1[0].get(\"attr\")\n'name'\n\n>>> main.test\n'me'\n\nOr the other way around to build xml structures:\n>>> item = objectify.Element(\"item\")\n\n>>> item.title = \"Best of python\"\n\n>>> item.price = 17.98\n\n>>> item.price.set(\"currency\", \"EUR\")\n\n>>> order = objectify.Element(\"order\")\n\n>>> order.append(item)\n\n>>> order.item.quantity = 3\n\n>>> order.price = sum(item.price * item.quantity\n... for item in order.item)\n\n>>> import lxml.etree\n\n>>> print lxml.etree.tostring(order, pretty_print=True)\n<order>\n  <item>\n    <title>Best of python</title>\n    <price currency=\"EUR\">17.98</price>\n    <quantity>3</quantity>\n  </item>\n  <price>53.94</price>\n</order>\n\n"}
{"input": "Do dicts preserve iteration order if they are not modified? If I have a dictionary in Python, and I iterate through it once, and then again later, is the iteration order guaranteed to be preserved given that I didn't insert, delete, or update any items in the dictionary? (But I might have done look-ups).\n", "output": "Here is what dict.items() documentation says:\n\ndict.items() return a copy of the dictionary\u00e2\u0080\u0099s list of (key, value) pairs.\nIf items(), keys(), values(), iteritems(), iterkeys(), and itervalues() are called with no intervening modifications to the dictionary, the lists will directly correspond.\n\nI think it's reasonable to assume that item ordering won't change if all you do is iteration.\n"}
{"input": "How to catch an exception in python and get a reference to the exception, WITHOUT knowing the type? I'm wondering how I can catch any raised object (i.e. a type that does not extend Exception), and still get a reference to it.\nI came across the desire to do this when using Jython.   When calling a Java method, if that method raises an exception, it will not extend Python's Exception class, so a block like this will not catch it:\ntry:\n    # some call to a java lib that raises an exception here\nexcept Exception, e:\n    # will never be entered\n\nI can do this, but then I have no access to the exception object that was raised.\ntry:\n    # some call to a java lib that raises an exception here\nexcept:\n    # will enter here, but there's no reference to the exception that was raised\n\nI can solve this by importing the Java exception type and catching it explicitly, but this makes it difficult/impossible to write generic exception handling wrappers/decorators.\nIs there a way to catch some arbitrary exception and still get a reference to it in the except block?\nI should note that I'm hoping for the exception handling decorator I am making to be usable with Python projects, not just with Jython projects.  I'd like to avoid importing java.lang.Exception because that just makes it Jython-only.  For example, I figure I can do something like this (but I haven't tried it), but I'd like to avoid it if I can.\ntry:\n    # some function that may be running jython and may raise a java exception\nexcept (Exception, java.lang.Exception), e:\n    # I imagine this would work, but it makes the code jython-only\n\n", "output": "You can reference exceptions using the sys module. sys.exc_info is a tuple of the type, the instance and the traceback.\nimport sys\n\ntry:\n    # some call to a java lib that raises an exception here\nexcept:\n    instance = sys.exc_info()[1]\n\n"}
{"input": "Efficiently generate a 16-character, alphanumeric string I'm looking for a very quick way to generate an alphanumeric unique id for a primary key in a table.\nWould something like this work?\ndef genKey():\n    hash = hashlib.md5(RANDOM_NUMBER).digest().encode(\"base64\")\n    alnum_hash = re.sub(r'[^a-zA-Z0-9]', \"\", hash)\n    return alnum_hash[:16]\n\nWhat would be a good way to generate random numbers?\nIf I base it on microtime, I have to account for the possibility of several calls of genKey() at the same time from different instances.\nOr is there a better way to do all this?\nThanks.\n", "output": "You can use this:\n>>> import random\n>>> ''.join(random.choice('0123456789ABCDEF') for i in range(16))\n'E2C6B2E19E4A7777'\n\nThere is no guarantee that the keys generated will be unique so you should be ready to retry with a new key in the case the original insert fails. Also, you might want to consider using a deterministic algorithm to generate a string from an auto-incremented id instead of using random values, as this will guarantee you uniqueness (but it will also give predictable keys).\n"}
{"input": "How to replace unicode characters by ascii characters in Python (perl script given)? I am trying to learn python and couldn't figure out how to translate the following perl script to python:\n#!/usr/bin/perl -w                     \n\nuse open qw(:std :utf8);\n\nwhile(<>) {\n  s/\\x{00E4}/ae/;\n  s/\\x{00F6}/oe/;\n  s/\\x{00FC}/ue/;\n  print;\n}\n\nThe script just changes unicode umlauts to alternative ascii output. (So the complete output is in ascii.) I would be grateful for any hints. Thanks! \n", "output": "For converting to ASCII you might want to try ASCII, Dammit or this recipe, which boils down to:\n>>> title = u\"Kl\u00c3\u00bcft skr\u00c3\u00a4ms inf\u00c3\u00b6r p\u00c3\u00a5 f\u00c3\u00a9d\u00c3\u00a9ral \u00c3\u00a9lectoral gro\u00c3\u009fe\"\n>>> import unicodedata\n>>> unicodedata.normalize('NFKD', title).encode('ascii','ignore')\n'Kluft skrams infor pa federal electoral groe'\n\n"}
{"input": "How to call a static methods on a django model class during a south migration I'm writing a data migration in south to fix some denormalized data I screwed up in earlier code.  The way to figure out the right value for the incorrect field is to call a static method on the django model class.  The code looks like this:\nclass Account(models.Model):\n    name = models.CharField()\n\n    @staticmethod\n    def lookup_by_name(name):\n        # There's actually more to it than this\n        return Account.objects.get(name=name)\n\nclass Record(models.Model):\n    account_name = models.CharField()\n    acct = models.ForeignKey('Account')\n\n...\nclass Migration(DataMigration):\n\n    def forwards(self, orm):\n        # Fixing Records with the wrong FK to Account\n        for record in orm.Record.objects.all():\n            record.acct = orm.Account.lookup_by_name(record.account_name)\n            record.save()\n\nBut this fails with\nAttributeError: type object 'Account' has no attribute 'lookup_by_name'\nI'm guessing south just doesn't support @staticmethods on model classes?\nTrying to import Account directly fails, unless I also import Record directly and completely ignore the ORM object.  Is that a safe option, since it's a data migration and the schema isn't changing?  Or should I just run this fix by hand rather than in the context of a south migration.\n", "output": "You can't use methods from models.py in south migrations. The reason is that in the future models.py will evolve and sooner or later you will delete those methods, then migration will be broken.\nYou should put all code needed by migration in migration file itself.\n"}
{"input": "Python: Easily access deeply nested dict (get and set) I'm building some Python code to read and manipulate deeply nested dicts (ultimately for interacting with JSON services, however it would be great to have for other purposes)  I'm looking for a way to easily read/set/update values deep within the dict, without needing a lot of code.  \n@see also http://stackoverflow.com/questions/3031219/python-recursively-access-dict-via-attributes-as-well-as-index-access -- Curt Hagenlocher's \"DotDictify\" solution is pretty eloquent.  I also like what Ben Alman presents for JavaScript in http://benalman.com/projects/jquery-getobject-plugin/  It would be great to somehow combine the two.\nBuilding off of Curt Hagenlocher and Ben Alman's examples, it would be great in Python to have a capability like:\n>>> my_obj = DotDictify()\n>>> my_obj.a.b.c = {'d':1, 'e':2}\n>>> print my_obj\n{'a': {'b': {'c': {'d': 1, 'e': 2}}}}\n>>> print my_obj.a.b.c.d\n1\n>>> print my_obj.a.b.c.x\nNone\n>>> print my_obj.a.b.c.d.x\nNone\n>>> print my_obj.a.b.c.d.x.y.z\nNone\n\nAny idea if this is possible, and if so, how to go about modifying the DotDictify solution?\nAlternatively, the get method could be made to accept a dot notation (and a complementary set method added) however the object notation sure is cleaner.\n>>> my_obj = DotDictify()\n>>> my_obj.set('a.b.c', {'d':1, 'e':2})\n>>> print my_obj\n{'a': {'b': {'c': {'d': 1, 'e': 2}}}}\n>>> print my_obj.get('a.b.c.d')\n1\n>>> print my_obj.get('a.b.c.x')\nNone\n>>> print my_obj.get('a.b.c.d.x')\nNone\n>>> print my_obj.get('a.b.c.d.x.y.z')\nNone\n\nThis type of interaction would be great to have for dealing with deeply nested dicts.  Does anybody know another strategy (or sample code snippet/library) to try?\n", "output": "Attribute Tree\nThe problem with your first specification is that Python can't tell in __getitem__ if, at my_obj.a.b.c.d, you will next proceed farther down a nonexistent tree, in which case it needs to return an object with a __getitem__ method so you won't get an AttributeError thrown at you, or if you want a value, in which case it needs to return None.\nI would argue that in every case you have above, you should expect it to throw a KeyError instead of returning None.  The reason being that you can't tell if None means \"no key\" or \"someone actually stored None at that location\".  For this behavior, all you have to do is take dotdictify, remove marker, and replace __getitem__ with:\ndef __getitem__(self, key):\n    return self[key]\n\nBecause what you really want is a dict with __getattr__ and __setattr__.\nThere may be a way to remove __getitem__ entirely and say something like __getattr__ = dict.__getitem__, but I think this may be over-optimization, and will be a problem if you later decide you want __getitem__ to create the tree as it goes like dotdictify originally does, in which case you would change it to:\ndef __getitem__(self, key):\n    if key not in self:\n        dict.__setitem__(self, key, dotdictify())\n    return dict.__getitem__(self, key)\n\nI don't like the marker business in the original dotdictify.\nPath Support\nThe second specification (override get() and set()) is that a normal dict has a get() that operates differently from what you describe and doesn't even have a set (though it has a setdefault() which is an inverse operation to get()).  People expect get to take two parameters, the second being a default if the key isn't found.\nIf you want to extend __getitem__ and __setitem__ to handle dotted-key notation, you'll need to modify doctictify to:\nclass dotdictify(dict):\n    def __init__(self, value=None):\n        if value is None:\n            pass\n        elif isinstance(value, dict):\n            for key in value:\n                self.__setitem__(key, value[key])\n        else:\n            raise TypeError, 'expected dict'\n\n    def __setitem__(self, key, value):\n        if '.' in key:\n            myKey, restOfKey = key.split('.', 1)\n            target = self.setdefault(myKey, dotdictify())\n            if not isinstance(target, dotdictify):\n                raise KeyError, 'cannot set \"%s\" in \"%s\" (%s)' % (restOfKey, myKey, repr(target))\n            target[restOfKey] = value\n        else:\n            if isinstance(value, dict) and not isinstance(value, dotdictify):\n                value = dotdictify(value)\n            dict.__setitem__(self, key, value)\n\n    def __getitem__(self, key):\n        if '.' not in key:\n            return dict.__getitem__(self, key)\n        myKey, restOfKey = key.split('.', 1)\n        target = dict.__getitem__(self, myKey)\n        if not isinstance(target, dotdictify):\n            raise KeyError, 'cannot get \"%s\" in \"%s\" (%s)' % (restOfKey, myKey, repr(target))\n        return target[restOfKey]\n\n    def __contains__(self, key):\n        if '.' not in key:\n            return dict.__contains__(self, key)\n        myKey, restOfKey = key.split('.', 1)\n        target = dict.__getitem__(self, myKey)\n        if not isinstance(target, dotdictify):\n            return False\n        return restOfKey in target\n\n    def setdefault(self, key, default):\n        if key not in self:\n            self[key] = default\n        return self[key]\n\n    __setattr__ = __setitem__\n    __getattr__ = __getitem__\n\nTest code:\n>>> life = dotdictify({'bigBang': {'stars': {'planets': {}}}})\n>>> life.bigBang.stars.planets\n{}\n>>> life.bigBang.stars.planets.earth = { 'singleCellLife' : {} }\n>>> life.bigBang.stars.planets\n{'earth': {'singleCellLife': {}}}\n>>> life['bigBang.stars.planets.mars.landers.vikings'] = 2\n>>> life.bigBang.stars.planets.mars.landers.vikings\n2\n>>> 'landers.vikings' in life.bigBang.stars.planets.mars\nTrue\n>>> life.get('bigBang.stars.planets.mars.landers.spirit', True)\nTrue\n>>> life.setdefault('bigBang.stars.planets.mars.landers.opportunity', True)\nTrue\n>>> 'landers.opportunity' in life.bigBang.stars.planets.mars\nTrue\n>>> life.bigBang.stars.planets.mars\n{'landers': {'opportunity': True, 'vikings': 2}}\n\n"}
{"input": "Is there a way in Python to return value via output parameter? Is there a way in Python to return a value via output parameter?\n", "output": "There is no reason to, since Python can return multiple values:\ndef func():\n    return 1,2,3\n\na,b,c = func()\n\nBut you can also pass a mutable parameter, and return values as well:\ndef func(a):\n    a.append(1)\n    a.append(2)\n    a.append(3)\n\nL=[]\nfunc(L)\nprint L\n\n"}
{"input": "Extract files from zip without keeping the structure using python ZipFile? I try to extract all files from .zip containing subfolders in one folder.  I want all the files from subfolders extract in only one folder without keeping the original structure. At the moment, I extract all, move the files to a folder, then remove previous subfolders.  The files with same names are overwrited.\nIs it possible to do it before writing files?\nHere is a structure for example:\nmy_zip/file1.txt\nmy_zip/dir1/file2.txt\nmy_zip/dir1/dir2/file3.txt\nmy_zip/dir3/file4.txt\n\nAt the end I whish this:\nmy_dir/file1.txt\nmy_dir/file2.txt\nmy_dir/file3.txt\nmy_dir/file4.txt\n\nWhat can I add to this code ?\nimport zipfile\nmy_dir = \"D:\\\\Download\\\\\"\nmy_zip = \"D:\\\\Download\\\\my_file.zip\"\n\nzip_file = zipfile.ZipFile(my_zip, 'r')\nfor files in zip_file.namelist():\n    zip_file.extract(files, my_dir)\nzip_file.close()\n\nif I rename files path from zip_file.namelist(), I have this error:\nKeyError: \"There is no item named 'file2.txt' in the archive\"\n\n", "output": "This opens file handles of members of the zip archive, extracts the filename and copies it to a target file (that's how ZipFile.extract works, without taken care of subdirectories).\nimport os\nimport shutil\nimport zipfile\n\nmy_dir = r\"D:\\Download\"\nmy_zip = r\"D:\\Download\\my_file.zip\"\n\nwith zipfile.ZipFile(my_zip) as zip_file:\n    for member in zip_file.namelist():\n        filename = os.path.basename(member)\n        # skip directories\n        if not filename:\n            continue\n\n        # copy file (taken from zipfile's extract)\n        source = zip_file.open(member)\n        target = file(os.path.join(my_dir, filename), \"wb\")\n        with source, target:\n            shutil.copyfileobj(source, target)\n\n"}
{"input": "Python CSV DictReader with UTF-8 data AFAIK, the Python (v2.6) csv module can't handle unicode data by default, correct? In the Python docs there's an example on how to read from a UTF-8 encoded file. But this example only returns the CSV rows as a list.\nI'd like to access the row columns by name as it is done by csv.DictReader but with UTF-8 encoded CSV input file.\nCan anyone tell me how to do this in an efficient way? I will have to process CSV files in 100's of MByte in size.\n", "output": "Actually, I came up with an answer myself (sorry for replying to my own question):\ndef UnicodeDictReader(utf8_data, **kwargs):\n    csv_reader = csv.DictReader(utf8_data, **kwargs)\n    for row in csv_reader:\n        yield {key: unicode(value, 'utf-8') for key, value in row.iteritems()}\n\n"}
{"input": "In Python how will you multiply individual elements of an array with a floating point or integer number? S=[22 33 45.6 21.6 51.8]\nP=2.45\n\nHere S is an array\nHow will I multiply this and get the value?\nSP=[53.9 80.85 111.72 52.92 126.91]\n\n", "output": "In numpy it is quite simple\nimport numpy as np\nP=2.45\nS=[22, 33, 45.6, 21.6, 51.8]\nSP = P*np.array(S)\n\nI recommend taking a look at the numpy tutorial for an explanation of the full capabilities of numpy's arrays:\nhttp://www.scipy.org/Tentative_NumPy_Tutorial\n"}
{"input": "Scala equivalent to Python returning multiple items In Python it's possible to do something like this:\ndef blarg():\n    return \"blargidy\", \"blarg\"\n\ni, j = blargh()\n\nIs there something similar available in scala? \n", "output": "You can return a tuple:\ndef blarg = (\"blargidy\", \"blarg\")\n\nval (i, j) = blarg\n\nNote the pattern-matching syntax for parallel variable assignment: this works for any pattern, not just for tuples. So for instance:\nval list = 1 :: 2 :: 3 :: Nil\n\nval x :: y = list // x = 1 and y = 2 :: 3 :: Nil\n\n"}
{"input": "Custom tab completion in python argparse Is it possible to get smarter tab completion cooperating with argparse in python scripts?  My shell is bash.  The information I can find about this stuff online is over a year old and mostly relates to optparse, which I don't want to use (deprecated).  \nFor the following script:\n#!/usr/bin/env python\n\nimport argparse as ap\n\ndef main(**args):\n    pass\n\nif __name__ == '__main__':\n    parser = ap.ArgumentParser()\n    parser.add_argument('positional', choices=['spam', 'eggs'])\n    parser.add_argument('--optional', choices=['foo1', 'foo2', 'bar'])\n    args = parser.parse_args()\n    main(**vars(args))\n\nAfter chmod +x on the file, my expected usage would be something like:\n./example.py sp<tab>             \n   ->  completes to ./example.py spam\n./example.py --op<tab> \n   ->  completes to ./example.py --optional\n./example.py --optional b<tab>\n   ->  completes to ./example.py --optional bar \n./example.py --optional f<tab>   \n   ->  completes to ./example.py --optional foo \n       and further, prints  foo1  foo2  on stdout\n\nI have no idea where to even start attacking on this ... \n", "output": "Have a look at argcomplete by Andrey Kislyuk.\nInstall it with:\nsudo pip install argcomplete\n\nImport the module and add one line in your source before calling parser.parse_args():\n#!/usr/bin/env python\n\nimport argparse as ap\nimport argcomplete\n\ndef main(**args):\n  pass\n\nif __name__ == '__main__':\n  parser = ap.ArgumentParser()\n  parser.add_argument('positional', choices=['spam', 'eggs'])\n  parser.add_argument('--optional', choices=['foo1', 'foo2', 'bar'])\n  argcomplete.autocomplete(parser)\n  args = parser.parse_args()\n  main(**vars(args))\n\nand to make sure that bash knows about this script, you use\neval \"$(register-python-argcomplete your_script)\"\n\nyou should put that line in your ~/.bashrc or follow argcomplete's docs and activate 'global' completion.\nAfter that you completion works as requested. \nThe way this work is that the eval line creates a function _python_argcomlete which is registered using complete. (Run register-python-argcomplete your_script to just have a look at what gets eval-ed into bash). The autocomplete function looks for environment variables set by the bash completion mechanism to see if it needs to act. If it acts, it exits the program. If it doesn't act, this is a normal call to the program that function does nothing and the normal flow of the program continues.\n"}
{"input": "View pdf image in an iPython Notebook The following code allows me to view a png image in an iPython notebook. Is there a way to view pdf image? I don't need to use IPython.display necessarily. I am looking for a way to print a pdf image in a file to the iPython notebook output cell.\n## This is for an `png` image\nfrom IPython.display import Image\n\nfig = Image(filename=('./temp/my_plot.png'))\nfig\n\nThank you.\n", "output": "The problem you (and others) face is that PDFs cannot be displayed directly in the browser.\nThe only possible way to get something similar is to use an image-converter to create a PNG or JPG out of the PDF and display this one.\nThis could be done via imagemagick and a custom display function.\nUpdate 1\nA simple solution is to use wand (http://docs.wand-py.org) a python-imagemagick binding. I tried with Ubuntu 13.04:\n\nIn text form:\nfrom wand.image import Image as WImage\nimg = WImage(filename='hat.pdf')\nimg\n\nFor a multi-page pdf, you can get e.g. the second page via:\nimg = WImage(filename='hat.pdf[1]')\n\nUpdate 2\nAs recent browsers support to display pdfs with their embedded pdf viewer a possible alternative solution based on an iframe can be implemented as\nclass PDF(object):\n  def __init__(self, pdf, size=(200,200)):\n    self.pdf = pdf\n    self.size = size\n\n  def _repr_html_(self):\n    return '<iframe src={0} width={1[0]} height={1[1]}></iframe>'.format(self.pdf, self.size)\n\n  def _repr_latex_(self):\n    return r'\\includegraphics[width=1.0\\textwidth]{{{0}}}'.format(self.pdf)\n\nThis class implements html and latex representations, hence the pdf will also survive a nbconversion to latex. It can be used like\nPDF('hat.pdf',size=(300,250))\n\nWith Firefox 33 this results in \n\n"}
{"input": "FastCGI C++ vs. A Script Language (PHP/Python/Perl) What are the ups and downs of using FastCGI C++  vs.  PHP/Python/Perl to do the same job. \nAny performance or design pitfalls or using one over the other? Even your opinions are welcome. (Tell me why one or the other rocks, or one or the other sucks). \n", "output": "scripting languages may be slower than C, but is this a problem? almost never. and if the performance becomes a problem, you start to translate only the critical parts.\ntwitter/ruby is a good example; ruby is slow. some of the language features (that make ruby nice in the first place) just prevent different kinds of optimization (there is a great article by the jruby guy about this ... was it ola bini? can't remember).\nstill, twitter is powered by ruby, because ruby is fast enough. not long ago, \"the blogs\" reported twitter migrating to scala for performance reasons ... the truth was, only the messaging queue (and other parts of the backend) moved to scala. yahoo runs on a mixture of languages; php for the frontend, other, faster languages are used where performance is critical.\nso, why is performance not that important? there are several reasons:\n\ndatabase bottleneck: not the scripting is slow, the database is\nclientside bottleneck: rendering in the browser takes longer than the request. optimize the server side, and nobody will notice\nhorizontal scaling: often it's cheaper to add another server and thus triple the requests/sec than to optimize the app\ndeveloper time and maintenance are the most expensive parts of your project. you'll get more cheap python devs that maintain your app than web-enabled c-coders in less time\nno compiling, short dev cycles\n\nanother pro-scripting point: many of the scripting languages support inlining or inclusion of fast (C) code:\n\npython, inline c\nphp: extensions in c\nserver-side javascript via rhino: direct access to java/jvm (a good example for this is orf.at, one of the biggest websites in austria, powered by helma - serverside jvm-interpreted javascript!)\n\ni think, especially in web developement the pros of high-level scripting far outweight the cons.\n"}
{"input": "Python-daemon doesn't kill its kids When using python-daemon, I'm creating subprocesses likeso:\nimport multiprocessing\n\nclass Worker(multiprocessing.Process):\n   def __init__(self, queue):\n      self.queue = queue # we wait for things from this in Worker.run()\n\n   ...\n\nq = multiprocessing.Queue()\n\nwith daemon.DaemonContext():\n    for i in xrange(3):\n       Worker(q)\n\n    while True: # let the Workers do their thing\n       q.put(_something_we_wait_for())\n\nWhen I kill the parent daemonic process (i.e. not a Worker) with a Ctrl-C or SIGTERM, etc., the children don't die. How does one kill the kids?\nMy first thought is to use atexit to kill all the workers, likeso:\n with daemon.DaemonContext():\n    workers = list()\n    for i in xrange(3):\n       workers.append(Worker(q))\n\n    @atexit.register\n    def kill_the_children():\n        for w in workers:\n            w.terminate()\n\n    while True: # let the Workers do their thing\n       q.put(_something_we_wait_for())\n\nHowever, the children of daemons are tricky things to handle, and I'd be obliged for thoughts and input on how this ought to be done.\nThank you.\n", "output": "Your options are a bit limited. If doing self.daemon = True in the constructor for the Worker class does not solve your problem and trying to catch signals in the Parent (ie, SIGTERM, SIGINT) doesn't work, you may have to try the opposite solution - instead of having the parent kill the children, you can have the children commit suicide when the parent dies.\nThe first step is to give the constructor to Worker the PID of the parent process (you can do this with os.getpid()). Then, instead of just doing self.queue.get() in the worker loop, do something like this:\nwaiting = True\nwhile waiting:\n    # see if Parent is at home\n    if os.getppid() != self.parentPID:\n        # woe is me! My Parent has died!\n        sys.exit() # or whatever you want to do to quit the Worker process\n    try:\n        # I picked the timeout randomly; use what works\n        data = self.queue.get(block=False, timeout=0.1)\n        waiting = False\n    except queue.Queue.Empty:\n        continue # try again\n# now do stuff with data\n\nThe solution above checks to see if the parent PID is different than what it originally was (that is, if the child process was adopted by init or lauchd because the parent died) - see reference. However, if that doesn't work for some reason you can replace it with the following function (adapted from here):\ndef parentIsAlive(self):\n    try:\n        # try to call Parent\n        os.kill(self.parentPID, 0)\n    except OSError:\n        # *beeep* oh no! The phone's disconnected!\n        return False\n    else:\n        # *ring* Hi mom!\n        return True\n\nNow, when the Parent dies (for whatever reason), the child Workers will spontaneously drop like flies - just as you wanted, you daemon! :-D\n"}
{"input": "Erase whole array Python How do I erase a whole array, leaving it with no items?\nI want to do this so I can store new values in it (a new set of 100 floats) and find the minimum.\nRight now my program is reading the minimum from sets before I think because it is appending itself with the previous set still in there.  I use .append by the way.\n", "output": "Note that list and array are different classes.  You can do:\ndel mylist[:]\n\nThis will actually modify your existing list.  David's answer creates a new list and assigns it to the same variable.  Which you want depends on the situation (e.g. does any other variable have a reference to the same list?).\nTry:\na = [1,2]\nb = a\na = []\n\nand\na = [1,2]\nb = a\ndel a[:]\n\nPrint a and b each time to see the difference.\n"}
{"input": "python, format string I am trying to build a format string with lazy argument, eg I need smth like:\n\"%s \\%s %s\" % ('foo', 'bar') # \"foo %s bar\"\n\nhow can i do this?\n", "output": "\"%s %%s %s\" % ('foo', 'bar')\n\nyou need %%\n"}
{"input": "format ints into string of hex I need to create a string of hex digits from a list of random integers (0-255). Each hex digit should be represented by two characters: 5 - \"05\", 16 - \"10\"...\ne.g.\n\ninput: [0,1,2,3,127,200,255], \noutput: 000102037fc8ff\n\nI've managed to come up with:\n#!/usr/bin/env python\n\ndef format_me(nums):\n    result = \"\"\n    for i in nums:\n        if i <= 9:\n            result += \"0%x\" % i\n        else:\n            result += \"%x\" % i\n    return result\n\nprint format_me([0,1,2,3,127,200,255])\n\nLooks a bit awkward. Is there a simpler way?\n", "output": "Just for completeness, using the modern .format() syntax:\n>>> numbers = [1, 15, 255]\n>>> ''.join('{:02X}'.format(a) for a in numbers)\n'010FFF'\n\n"}
{"input": "switching keys and values in a dictionary in python Say I have a dictionary like so:\nmy_dict = {2:3, 5:6, 8:9}\n\nIs there a way that I can switch the keys and values to get:\n{3:2, 6:5, 9:8}\n\n", "output": "my_dict2 = dict((y,x) for x,y in my_dict.iteritems())\n\nIf you are using python 2.7 or 3.x you can use a dictionary comprehension instead:\nmy_dict2 = {y:x for x,y in my_dict.iteritems()}\n\nEdit\nAs noted in the comments by JBernardo, for python 3.x you need to use items instead of iteritems\n"}
{"input": "What is a generative method? I'm familiar with Python generators, however I've just come across the term \"generative method\" which I am not familiar with and cannot find a satisfactory definition.\nTo put it in context, I found the term in SQLAlchemy's narrative documentation:\n\nFull control of the \u00e2\u0080\u009cautocommit\u00e2\u0080\u009d behavior is available using the generative Connection.execution_options() method provided on Connection, Engine, Executable, using the \u00e2\u0080\u009cautocommit\u00e2\u0080\u009d flag which will turn on or off the autocommit for the selected scope.\n\nWhat is a generative method? Trying to iterate the object returned by Connection.execution_options() doesn't work so I'm figuring it's something other than a standard generator.\n", "output": "It doesn't appear to be a common database concept, but SQLAlchemy uses the term generative in the sense \"generated by your program iteratively at runtime\". (So, no connection to python generators). An example from the tutorial:\n\nThe Query object is fully generative, meaning that most method calls\n  return a new Query object upon which further criteria may be added.\n  For example, to query for users named \u00e2\u0080\u009ced\u00e2\u0080\u009d with a full name of \u00e2\u0080\u009cEd\n  Jones\u00e2\u0080\u009d, you can call filter() twice, which joins criteria using AND:\n\n>>> for user in session.query(User).\\\n...   filter(User.name=='ed').\\\n...   filter(User.fullname=='Ed Jones'):\n...     print user\n\nThis call syntax is more commonly known as \"method chaining\", and the design that allows it as a \"fluent interface\". \nSo, in the case of Connection.execution_options(), \"generative\" means that it returns the modified connection object, so that you can chain the calls as above. \n"}
{"input": "How to generate audio from a numpy array? I want to create \"heart rate monitor\" effect from a 2D array in numpy and want the tone to reflect the values in the array.\n", "output": "You can use the write function from scipy.io.wavfile to create a wav file which you can then play however you wish. Note that the array must be integers, so if you have floats, you might want to scale them appropriately:\nimport numpy as np\nfrom scipy.io.wavfile import write\n\ndata = np.random.uniform(-1,1,44100) # 44100 random samples between -1 and 1\nscaled = np.int16(data/np.max(np.abs(data)) * 32767)\nwrite('test.wav', 44100, scaled)\n\nIf you want Python to actually play audio, then this page provides an overview of some of the packages/modules.\n"}
{"input": "how can I set the last modified time of a file from python? I have a python script that downloads a file over FTP using ftplib.\nMy current download code looks just like the example in the ftp lib docs:\nftp.retrbinary('RETR README', open('README', 'wb').write)\n\nNow I have a requirement that the file downloaded over FTP needs to have the same last modified time as the file on the FTP server itself. Assuming I could parse out the time from ftp.retrlines('list'), how can I set the modified time on the downloaded file?\nI'm on a unix based OS if that matters.\n", "output": "If you want to do this directly from python, you're looking for os.utime. The docs can give you more info.\n"}
{"input": "Parsing apache log files I just started learning Python and would like to read an Apache log file and put parts of each line into different lists.\nline from the file\n\n172.16.0.3 - - [25/Sep/2002:14:04:19 +0200] \"GET / HTTP/1.1\" 401 - \"\" \"Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1) Gecko/20020827\"\n\naccording to Apache website the format is \n\n%h %l %u %t \\\"%r\\\" %>s %b \\\"%{Referer}i\\\" \\\"%{User-Agent}i\\\n\nI'm able to open the file and just read it as it is but I don't know how to make it read in that format so I can put each part in a list.\n", "output": "This is a job for regular expressions.\nFor example:\nline = '172.16.0.3 - - [25/Sep/2002:14:04:19 +0200] \"GET / HTTP/1.1\" 401 - \"\" \"Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1) Gecko/20020827\"'\nregex = '([(\\d\\.)]+) - - \\[(.*?)\\] \"(.*?)\" (\\d+) - \"(.*?)\" \"(.*?)\"'\n\nimport re\nprint re.match(regex, line).groups()\n\nThe output would be a tuple with 6 pieces of information from the line (specifically, the groups within parentheses in that pattern):\n('172.16.0.3', '25/Sep/2002:14:04:19 +0200', 'GET / HTTP/1.1', '401', '', 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1) Gecko/20020827')\n\n"}
{"input": "Is there a direct approach to format numbers in jinja2? I need to format decimal numbers in jinja2. \nWhen I need to format dates, I call the strftime() method in my template, like this:\n{{ somedate.strftime('%Y-%m-%d') }}\n\nI wonder if there is a similar approach to do this over numbers.\nThanks in advance!\n", "output": "You can do it simply like this, the Python way:\n{{ '%04d' % 42 }}\n\n{{ 'Number: %d' % variable }}\n\nOr using that method:\n{{ '%d' | format(42) }}\n\nI personally prefer the first one since it's exactly like in Python.\n"}
{"input": "Pretty print 2D Python list Is there a simple, built-in way to print a 2D Python list as a 2D matrix?\nSo this:\n[[\"A\", \"B\"], [\"C\", \"D\"]]\n\nwould become something like\nA    B\nC    D\n\nI found the pprint module, but it doesn't seem to do what I want.\n", "output": "To make things interesting, let's try with a bigger matrix:\nmatrix = [\n   [\"Ah!\",  \"We do have some Camembert\", \"sir\"],\n   [\"It's a bit\", \"runny\", \"sir\"],\n   [\"Well,\",  \"as a matter of fact it's\", \"very runny, sir\"],\n   [\"I think it's runnier\",  \"than you\",  \"like it, sir\"]\n]\n\ns = [[str(e) for e in row] for row in matrix]\nlens = [max(map(len, col)) for col in zip(*s)]\nfmt = '\\t'.join('{{:{}}}'.format(x) for x in lens)\ntable = [fmt.format(*row) for row in s]\nprint '\\n'.join(table)\n\nOutput:\nAh!                     We do have some Camembert   sir            \nIt's a bit              runny                       sir            \nWell,                   as a matter of fact it's    very runny, sir\nI think it's runnier    than you                    like it, sir  \n\nUPD: for multiline cells, something like this should work:\ntext = [\n    [\"Ah!\",  \"We do have\\nsome Camembert\", \"sir\"],\n    [\"It's a bit\", \"runny\", \"sir\"],\n    [\"Well,\",  \"as a matter\\nof fact it's\", \"very runny,\\nsir\"],\n    [\"I think it's\\nrunnier\",  \"than you\",  \"like it,\\nsir\"]\n]\n\nfrom itertools import chain, izip_longest\n\nmatrix = chain.from_iterable(\n    izip_longest(\n        *(x.splitlines() for x in y), \n        fillvalue='') \n    for y in text)\n\nAnd then apply the above code.\nSee also http://pypi.python.org/pypi/texttable\n"}
{"input": "Python exception handling - line number I'm using python to evaluate some measured data. Because of many possible results it is difficult to handle or possible combinations. Sometimes an error happens during the evaluation. It is usually an index error because I get out of range from measured data.\nIt is very difficult to find out on which place in code the problem happened. It would help a lot if I knew on which line the error was raised. If I use following code:\ntry:\n    result = evaluateData(data)\nexcept Exception, err:\n    print (\"Error: %s.\\n\" % str(err))\n\nUnfortunately this only tells me that there is and index error. I would like to know more details about the exception (line in code, variable etc.) to find out what happened. Is it possible?\nThank you.\n", "output": "Solution, printing filename, linenumber, line itself and exception descrition:\nimport linecache\nimport sys\n\ndef PrintException():\n    exc_type, exc_obj, tb = sys.exc_info()\n    f = tb.tb_frame\n    lineno = tb.tb_lineno\n    filename = f.f_code.co_filename\n    linecache.checkcache(filename)\n    line = linecache.getline(filename, lineno, f.f_globals)\n    print 'EXCEPTION IN ({}, LINE {} \"{}\"): {}'.format(filename, lineno, line.strip(), exc_obj)\n\n\ntry:\n    print 1/0\nexcept:\n    PrintException()\n\nOutput:\nEXCEPTION IN (D:/Projects/delme3.py, LINE 15 \"print 1/0\"): integer division or modulo by zero\n\n"}
{"input": "flask sqlalchemy querying a column with not equals I can query my seats table for all seats where there is no invite assigned:\nseats = Seat.query.filter_by(invite=None).all()\n\nHowever, when querying for all seats that have an invite assigned, I get an err:\nseats = Seat.query.filter_by(invite!=None).all()\n\nNameError: name 'invite' is not defined\n\nHere is my Seat class:\nclass Seat(db.Model):\n    id = db.Column(db.Integer, primary_key=True)\n\n    invite_id = db.Column(db.Integer, db.ForeignKey('invite.id'))\n    invite = db.relationship('Invite',\n        backref=db.backref('folks', lazy='dynamic'))\n\nHow can I query for all seats where the owner is not blank?\nThanks\n", "output": "The filter_by() method takes a sequence of keyword arguments, so you always have to use '=' with it. You want to use the filter() method which allows for !=.\nseats = Seat.query.filter(Seat.invite != None).all()\n\nor \nseats = Seat.query.filter(Seat.invite_id != None).all()\n\n"}
{"input": "u'\\ufeff' in Python string I get an error with the following patter:\nUnicodeEncodeError: 'ascii' codec can't encode character u'\\ufeff' in position 155: ordinal not in range(128)\n\nNot sure what u'\\ufeff' is, it shows up when I'm web scraping. How can I remedy the situation? The .replace() string method doesn't work on it. \n", "output": "The Unicode character U+FEFF is the byte order mark, or BOM, and is used to tell the difference between big- and little-endian UTF-16 encoding.  If you decode the web page using the right codec, Python will remove it for you.  Examples:\n#!python2\n#coding: utf8\nu = u'ABC'\ne8 = u.encode('utf-8')        # encode without BOM\ne8s = u.encode('utf-8-sig')   # encode with BOM\ne16 = u.encode('utf-16')      # encode with BOM\ne16le = u.encode('utf-16le')  # encode without BOM\ne16be = u.encode('utf-16be')  # encode without BOM\nprint 'utf-8     %r' % e8\nprint 'utf-8-sig %r' % e8s\nprint 'utf-16    %r' % e16\nprint 'utf-16le  %r' % e16le\nprint 'utf-16be  %r' % e16be\nprint\nprint 'utf-8  w/ BOM decoded with utf-8     %r' % e8s.decode('utf-8')\nprint 'utf-8  w/ BOM decoded with utf-8-sig %r' % e8s.decode('utf-8-sig')\nprint 'utf-16 w/ BOM decoded with utf-16    %r' % e8s.decode('utf-16')\nprint 'utf-16 w/ BOM decoded with utf-16le  %r' % e8s.decode('utf-16le')\n\nNote that EF BB BF is a UTF-8-encoded BOM.  It is not required for UTF-8, but serves only as a signature (usually on Windows).\nOutput:\nutf-8     'ABC'\nutf-8-sig '\\xef\\xbb\\xbfABC'\nutf-16    '\\xff\\xfeA\\x00B\\x00C\\x00'    # Adds BOM and encodes using native processor endian-ness.\nutf-16le  'A\\x00B\\x00C\\x00'\nutf-16be  '\\x00A\\x00B\\x00C'\n\nutf-8  w/ BOM decoded with utf-8     u'\\ufeffABC'    # doesn't remove BOM if present.\nutf-8  w/ BOM decoded with utf-8-sig u'ABC'          # removes BOM if present.\nutf-16 w/ BOM decoded with utf-16    u'ABC'          # *requires* BOM to be present.\nutf-16 w/ BOM decoded with utf-16le  u'\\ufeffABC'    # doesn't remove BOM if present.\n\nNote that the utf-16 coded requires BOM to be present, or Python won't know if the data is big- or little-endian.\n"}
{"input": "flake8 complains on boolean comparison \"==\" in filter clause I have a boolean field in the mysql db table. \n# table model\nclass TestCase(Base):\n    __tablename__ = 'test_cases'\n    ...\n    obsoleted = Column('obsoleted',  Boolean)\n\nTo get the count of all the non-obsoleted test cases, that can be done simply like this:\ncaseNum = session.query(TestCase).filter(TestCase.obsoleted == False).count()\nprint(caseNum)\n\nThat works fine, but the flake8 report the following warning:\n\nE712: Comparison to False should be \"if cond is False:\" or \"if not\n  cond:\"\n\nOkay, I think that make sense. So change my code to this:\ncaseNum = session.query(TestCase).filter(TestCase.obsoleted is False).count()\n\nor \ncaseNum = session.query(TestCase).filter(not TestCase.obsoleted).count()\n\nBut neither of them can work.  The result is always 0.\nI think the filter clause doesn't support the operator \"is\" or \"is not\". Will someone can tell me how to handle this situation. I don't want to disable the flake. \n", "output": "That's because SQLAlchemy filters are one of the few places where == False actually makes sense. Everywhere else you should not use it.\nAdd a # noqa comment to the line and be done with it.\nOr you can use sqlalchemy.sql.expression.false:\nfrom sqlalchemy.sql.expression import false\n\nTestCase.obsoleted == false()\n\nwhere false() returns the right value for your session SQL dialect. There is a matching sqlalchemy.expression.true.\n"}
{"input": "python subprocess changing directory I want to execute a script inside a subdirectory/superdirectory (I need to be inside this sub/super-directory first). I can't get subprocess to enter my subdirectory:\ntducin@localhost:~/Projekty/tests/ve$ python\nPython 2.7.4 (default, Sep 26 2013, 03:20:26) \n[GCC 4.7.3] on linux2\nType \"help\", \"copyright\", \"credits\" or \"license\" for more information.\n>>> import subprocess\n>>> import os\n>>> os.getcwd()\n'/home/tducin/Projekty/tests/ve'\n>>> subprocess.call(['cd ..'])\nTraceback (most recent call last):\n  File \"<stdin>\", line 1, in <module>\n  File \"/usr/lib/python2.7/subprocess.py\", line 524, in call\n    return Popen(*popenargs, **kwargs).wait()\n  File \"/usr/lib/python2.7/subprocess.py\", line 711, in __init__\n    errread, errwrite)\n  File \"/usr/lib/python2.7/subprocess.py\", line 1308, in _execute_child\n    raise child_exception\nOSError: [Errno 2] No such file or directory\n\nPython throws OSError and I don't know why. It doesn't matter whether I try to go into an existing subdir or go one directory up (as above) - I always end up with the same error.\n", "output": "What your code tries to do is call a program named cd ... What you want is call a command named cd.\nBut cd is a shell internal. So you can only call it as\nsubprocess.call('cd ..', shell=True) # pointless code! See text below.\n\nBut it is pointless to do so. As no process can change another process's working directory (again, at least on a UNIX-like OS, but as well on Windows), this call will have the subshell change its dir and exit immediately.\nWhat you want can be achieved with os.chdir() or with the subprocess named parameter cwd which changes the working directory immediately before executing a subprocess.\n"}
{"input": "Matplotlib log scale tick label number formatting With matplotlib when a log scale is specified for an axis, the default method of labeling that axis is with numbers that are 10 to a power eg. 10^6. Is there an easy way to change all of these labels to be their full numerical representation? eg. 1, 10, 100, etc.\nNote that I do not know what the range of powers will be and want to support an arbitrary range (negatives included).\n", "output": "Sure, just change the formatter.\nFor example, if we have this plot:\nimport matplotlib.pyplot as plt\n\nfig, ax = plt.subplots()\nax.axis([1, 10000, 1, 100000])\nax.loglog()\n\nplt.show()\n\n\nYou could set the tick labels manually, but then the tick locations and labels would be fixed when you zoom/pan/etc.  Therefore, it's best to change the formatter:\nfrom matplotlib.ticker import ScalarFormatter\nfor axis in [ax.xaxis, ax.yaxis]:\n    axis.set_major_formatter(ScalarFormatter())\n\n\n"}
{"input": "How does the min/max function on a nested list work? Lets say, there is a nested list, like:\nmy_list = [[1, 2, 21], [1, 3], [1, 2]]\n\nWhen the function min() is called on this:\nmin(my_list)\n\nThe output received is \n[1, 2]\n\nWhy and How does it work? What are some use cases of it?\n", "output": "How are lists and other sequences compared in Python?\nLists (and other sequences) in Python are compared lexicographically and not based on any other parameter.\n\nSequence objects may be compared to other objects with the same sequence type. The comparison uses lexicographical ordering: first the first two items are compared, and if they differ this determines the outcome of the comparison; if they are equal, the next two items are compared, and so on, until either sequence is exhausted.\n\n\nWhat is lexicographic sorting?\nFrom the Wikipedia page on lexicographic sorting \n\nlexicographic or lexicographical order (also known as lexical order, dictionary order, alphabetical order or lexicographic(al) product) is a generalization of the way the alphabetical order of words is based on the alphabetical order of their component letters.\n\nThe min function returns the smallest value in the iterable. So the lexicographic value of [1,2] is the least in that list. You can check by using [1,2,21]\n>>> my_list=[[1,2,21],[1,3],[1,2]]\n>>> min(my_list)\n[1, 2]\n\n\nWhat is happening in this case of min?\nGoing element wise on my_list, firstly [1,2,21] and [1,3]. Now from the docs \n\nIf two items to be compared are themselves sequences of the same type, the lexicographical comparison is carried out recursively.\n\nThus the value of [1,1,21] is less than [1,3], because the second element of [1,3], which is, 3 is lexicographically higher than the value of the second element of [1,1,21], which is, 1. \nNow comparing [1,2] and [1,2,21], and adding another reference from the docs\n\nIf one sequence is an initial sub-sequence of the other, the shorter sequence is the smaller (lesser) one.\n\n[1,2] is an initial sub-sequence of [1,2,21]. Therefore the value of [1,2] on the whole is smaller than that of [1,2,21]. Hence [1,2] is returned as the output. \nThis can be validated by using the sorted function\n>>> sorted(my_list)\n[[1, 2], [1, 2, 21], [1, 3]]\n\n\nWhat if the list has multiple minimum elements?\nIf the list contains duplicate min elements the first is returned\n>>> my_list=[[1,2],[1,2]]\n>>> min(my_list)\n[1, 2]\n\nThis can be confirmed using the id function call\n>>> my_list=[[1,2],[1,2]]\n>>> [id(i) for i in my_list]\n[140297364849368, 140297364850160]\n>>> id(min(my_list))\n140297364849368\n\n\nWhat do I need to do to prevent lexicographic comparison in min?\nIf the required comparison is not lexicographic then the key argument can be used (as mentioned by Padraic)\nThe min function has an additional optional argument called key. The key argument takes a function.\n\nThe optional key argument specifies a one-argument ordering function\n  like that used for list.sort(). The key argument, if supplied, must be\n  in keyword form (for example, min(a,b,c,key=func)).\n\nFor example, if we need the smallest element by length, we need to use the len function. \n>>> my_list=[[1,2,21],[1,3],[1,2]]\n>>> min(my_list,key=len)            # Notice the key argument\n[1, 3]\n\nAs we can see the first shortest element is returned here. \n\nWhat if the list is heterogeneous?\nUntil Python2\nIf the list is heterogeneous type names are considered for ordering, check Comparisions, \n\nObjects of different types except numbers are ordered by their type names\n\nHence if you put an int and a list there you will get the integer value as the smallest as i is of lower value than l. Similarly '1' would be of higher value than both of this. \n>>> my_list=[[1,1,21],1,'1']\n>>> min(my_list)\n1\n\nPython3 and onwards\nHowever this confusing technique was removed in Python3. It now raises a TypeError. Read What's new in Python 3.0 \n\nThe ordering comparison operators (<, <=, >=, >) raise a TypeError exception when the operands don\u00e2\u0080\u0099t have a meaningful natural ordering. Thus, expressions like 1 < '', 0 > None or len <= len are no longer valid, and e.g. None < None raises TypeError instead of returning False. A corollary is that sorting a heterogeneous list no longer makes sense \u00e2\u0080\u0093 all the elements must be comparable to each other.\n\n>>> my_list=[[1,1,21],1,'1']\n>>> min(my_list)\nTraceback (most recent call last):\n  File \"<stdin>\", line 1, in <module>\nTypeError: unorderable types: int() < list()\n\nBut it works for Comparable types, For example\n>>> my_list=[1,2.0]\n>>> min(my_list)\n1\n\nHere we can see that the list contains float values and int values. But as float and int are comparable types, min function works in this case. \n"}
{"input": "py2exe + sqlalchemy + sqlite problem I am playing around with getting some basic stuff to work in Python before i go into full speed dev mode. Here are the specifics:\nPython 2.5.4\nPyQt4 4.4.3\nSqlAlchemy 0.5.2\npy2exe 0.6.9\nsetuptools 0.6c9\npysqlite 2.5.1\n\nsetup.py:\nfrom distutils.core import setup\nimport py2exe\n\nsetup(windows=[{\"script\" : \"main.py\"}], options={\"py2exe\" : {\"includes\" : [\"sip\", \"PyQt4.QtSql\",\"sqlite3\"],\"packages\":[\"sqlite3\",]}})\n\npy2exe appears to generate the .exe file correctly, but when i execute dist/main.exe i get this in the main.exe.log\nTraceback (most recent call last):\n  File \"main.py\", line 18, in <module>\n  File \"main.py\", line 14, in main\n  File \"db\\manager.pyc\", line 12, in __init__\n  File \"sqlalchemy\\engine\\__init__.pyc\", line 223, in create_engine\n  File \"sqlalchemy\\engine\\strategies.pyc\", line 48, in create\n  File \"sqlalchemy\\engine\\url.pyc\", line 91, in get_dialect\nImportError: No module named sqlite\n\nI've been googling my heart out, but can't seem to find any solutions to this. If i can't get this to work now, my hopes of using Python for this project will be dashed and i will start over using Ruby... (not that there is anything wrong with Ruby, i just wanted to use this project as a good way to teach myself Python)\n", "output": "you need to include the sqlalchemy.databases.sqlite package\nsetup(\n  windows=[{\"script\" : \"main.py\"}],\n  options={\"py2exe\" : {\n    \"includes\": [\"sip\", \"PyQt4.QtSql\"],\n    \"packages\": [\"sqlalchemy.databases.sqlite\"]\n}})\n\n"}
{"input": "How to execute a process remotely using python I want to connect too and execute a process on a remote server using python.  I want to be able to get the return code and stderr(if any) of the process.  Has anyone ever done anything like this before.  I have done it with ssh, but I want to do it from python script.\nCheers.\n", "output": "Use the ssh module called paramiko which was created for this purpose instead of using subprocess. Here's an example below:\nfrom paramiko import SSHClient\nclient = SSHClient()\nclient.load_system_host_keys()\nclient.connect(\"hostname\", username=\"user\")\nstdin, stdout, stderr = client.exec_command('program')\nprint \"stderr: \", stderr.readlines()\nprint \"pwd: \", stdout.readlines()\n\nUPDATE: The example used to use the ssh module, but that is now deprecated and paramiko is the up-to-date module that provides ssh functionality in python.\n"}
{"input": "Delete multiple files matching a pattern I have made an online gallery using Python and Django. I've just started to add editing functionality, starting with a rotation. I use sorl.thumbnail to auto-generate thumbnails on demand.\nWhen I edit the original file, I need to clean up all the thumbnails so new ones are generated. There are three or four of them per image (I have different ones for different occasions). \nI could hard-code in the file-varients... But that's messy and if I change the way I do things, I'll need to revisit the code.\nIdeally I'd like to do a regex-delete. In regex terms, all my originals are named like so:\n^(?P<photo_id>\\d+)\\.jpg$\n\nSo I want to delete:\n^(?P<photo_id>\\d+)[^\\d].*jpg$\n\n(Where I replace photo_id with the ID I want to clean.)\n", "output": "Try something like this:\nimport os, re\n\ndef purge(dir, pattern):\n    for f in os.listdir(dir):\n    \tif re.search(pattern, f):\n    \t\tos.remove(os.path.join(dir, f))\n\nThen you would pass the directory containing the files and the pattern you wish to match.\n"}
{"input": "Python parsing bracketed blocks What would be the best way in Python to parse out chunks of text contained in matching brackets?\n\"{ { a } { b } { { { c } } } }\"\n\nshould initially return:\n[ \"{ a } { b } { { { c } } }\" ]\n\nputting that as an input should return:\n[ \"a\", \"b\", \"{ { c } }\" ]\n\nwhich should return:\n[ \"{ c }\" ]\n\n[ \"c\" ]\n\n[]\n\n", "output": "Or this pyparsing version:\n>>> from pyparsing import nestedExpr\n>>> txt = \"{ { a } { b } { { { c } } } }\"\n>>>\n>>> nestedExpr('{','}').parseString(txt).asList()\n[[['a'], ['b'], [[['c']]]]]\n>>>\n\n"}
{"input": "How to apply itertools.product to elements of a list of lists? I have a list of arrays and I would like to get the cartesian product of the elements in the arrays.\nI will use an example to make this more concrete...\nitertools.product seems to do the trick but I am stuck in a little detail.\narrays = [(-1,+1), (-2,+2), (-3,+3)];\n\nIf I do\ncp = list(itertools.product(arrays));\n\nI get \ncp = cp0 = [((-1, 1),), ((-2, 2),), ((-3, 3),)]\n\nBut what I want to get is\ncp1 = [(-1,-2,-3), (-1,-2,+3), (-1,+2,-3), (-1,+2,+3), ..., (+1,+2,-3), (+1,+2,+3)].\n\nI have tried a few different things:\ncp = list(itertools.product(itertools.islice(arrays, len(arrays))));\ncp = list(itertools.product(iter(arrays, len(arrays))));\n\nThey all gave me cp0 instead of cp1.\nAny ideas?\nThanks in advance.\n", "output": ">>> arrays = [(-1,+1), (-2,+2), (-3,+3)]\n>>> list(itertools.product(*arrays))\n[(-1, -2, -3), (-1, -2, 3), (-1, 2, -3), (-1, 2, 3), (1, -2, -3), (1, -2, 3), (1, 2, -3), (1, 2, 3)]\n\n"}
{"input": "Plotting implicit equations in 3d I'd like to plot implicit equation F(x,y,z) = 0 in 3D. Is it possible in Matplotlib?\n", "output": "You can trick matplotlib into plotting implicit equations in 3D.  Just make a one-level contour plot of the equation for each z value within the desired limits.  You can repeat the process along the y and z axes as well for a more solid-looking shape.\nfrom mpl_toolkits.mplot3d import axes3d\nimport matplotlib.pyplot as plt\nimport numpy as np\n\ndef plot_implicit(fn, bbox=(-2.5,2.5)):\n    ''' create a plot of an implicit function\n    fn  ...implicit function (plot where fn==0)\n    bbox ..the x,y,and z limits of plotted interval'''\n    xmin, xmax, ymin, ymax, zmin, zmax = bbox*3\n    fig = plt.figure()\n    ax = fig.add_subplot(111, projection='3d')\n    A = np.linspace(xmin, xmax, 100) # resolution of the contour\n    B = np.linspace(xmin, xmax, 15) # number of slices\n    A1,A2 = np.meshgrid(A,A) # grid on which the contour is plotted\n\n    for z in B: # plot contours in the XY plane\n        X,Y = A1,A2\n        Z = fn(X,Y,z)\n        cset = ax.contour(X, Y, Z+z, [z], zdir='z')\n        # [z] defines the only level to plot for this contour for this value of z\n\n    for y in B: # plot contours in the XZ plane\n        X,Z = A1,A2\n        Y = fn(X,y,Z)\n        cset = ax.contour(X, Y+y, Z, [y], zdir='y')\n\n    for x in B: # plot contours in the YZ plane\n        Y,Z = A1,A2\n        X = fn(x,Y,Z)\n        cset = ax.contour(X+x, Y, Z, [x], zdir='x')\n\n    # must set plot limits because the contour will likely extend\n    # way beyond the displayed level.  Otherwise matplotlib extends the plot limits\n    # to encompass all values in the contour.\n    ax.set_zlim3d(zmin,zmax)\n    ax.set_xlim3d(xmin,xmax)\n    ax.set_ylim3d(ymin,ymax)\n\n    plt.show()\n\nHere's the plot of the Goursat Tangle:\ndef goursat_tangle(x,y,z):\n    a,b,c = 0.0,-5.0,11.8\n    return x**4+y**4+z**4+a*(x**2+y**2+z**2)**2+b*(x**2+y**2+z**2)+c\n\nplot_implicit(goursat_tangle)\n\n\nYou can make it easier to visualize by adding depth cues with creative colormapping:\n\nHere's how the OP's plot looks:\ndef hyp_part1(x,y,z):\n    return -(x**2) - (y**2) + (z**2) - 1\n\nplot_implicit(hyp_part1, bbox=(-100.,100.))\n\n\nBonus:  You can use python to functionally combine these implicit functions:\ndef sphere(x,y,z):\n    return x**2 + y**2 + z**2 - 2.0**2\n\ndef translate(fn,x,y,z):\n    return lambda a,b,c: fn(x-a,y-b,z-c)\n\ndef union(*fns):\n    return lambda x,y,z: np.min(\n        [fn(x,y,z) for fn in fns], 0)\n\ndef intersect(*fns):\n    return lambda x,y,z: np.max(\n        [fn(x,y,z) for fn in fns], 0)\n\ndef subtract(fn1, fn2):\n    return intersect(fn1, lambda *args:-fn2(*args))\n\nplot_implicit(union(sphere,translate(sphere, 1.,1.,1.)), (-2.,3.))\n\n\n"}
{"input": "Internals of Python list, access and resizing runtimes Is Python's [] a list or array. Is the access time of an index O(1) like an array or O(n) like a list, and is appending/resizing O(1) like a list or O(n) like an array, or is it a hybrid that can manage O(1) for accessing and resizing? \nI read somewhere (http://norvig.com/python-lisp.html) that array access is really slow in Python. However, when I wrote a memoized version of a recursive fibonacci procedure using both a dictionary (Python's dictionary is suppose to be really fast) and list, they had equal times. Why is this? Does a Python tuple have faster access times than a python list?\n", "output": "Python's [] is implemented as an array, not a linked list. Although resizing is O(n), appending to it is amortized O(1), because resizes happen very rarely. If you're not familiar with how this works, read this Wikipedia entry on dynamic arrays. Python's list doesn't expand by a factor of 2 each time, it's a bit more complicated than that, but the expansions are still designed make appending amortized O(1).\nInserting in the middle, however, is always an inefficient O(n), because n items may have to be moved. \nTuples aren't faster than lists - they're just immutable lists under the hood (*).\nRegarding your dictionary test: depending on your exact implementation, caching in a list will be faster than with a dict. However, Python's dicts are highly optimized, and especially for small amounts of keys will perform great.\n\n(*) Here's a list's \"get item\" C function in Python 2.6:\nPyObject *\nPyList_GetItem(PyObject *op, Py_ssize_t i)\n{\n    if (!PyList_Check(op)) {\n        PyErr_BadInternalCall();\n        return NULL;\n    }\n    if (i < 0 || i >= Py_SIZE(op)) {\n        if (indexerr == NULL)\n            indexerr = PyString_FromString(\n                \"list index out of range\");\n        PyErr_SetObject(PyExc_IndexError, indexerr);\n        return NULL;\n    }\n    return ((PyListObject *)op) -> ob_item[i];\n}\n\nAnd this is a tuple's:\nPyObject *\nPyTuple_GetItem(register PyObject *op, register Py_ssize_t i)\n{\n    if (!PyTuple_Check(op)) {\n        PyErr_BadInternalCall();\n        return NULL;\n    }\n    if (i < 0 || i >= Py_SIZE(op)) {\n        PyErr_SetString(PyExc_IndexError, \"tuple index out of range\");\n        return NULL;\n    }\n    return ((PyTupleObject *)op) -> ob_item[i];\n}\n\nAs you can see, they're almost exactly the same. In the end, after some type and bound checking, it's a simple pointer access with an index.\n"}
{"input": "Why can't generators be pickled? Python's pickle (I'm talking standard Python 2.5/2.6/2.7 here) cannot pickle locks, file objects etc. \nIt also cannot pickle generators and lambda expressions (or any other anonymous code), because the pickle really only stores name references.\nIn case of locks and OS-dependent features, the reason why you cannot pickle them is obvious and makes sense.\nBut why can't you pickle generators?\n\nNote: just for clarity -- I'm interested in the fundamental reason (or assumptions and choices that went into that design decision) why, not in \"because it gives you a Pickle error\". \nI realize the question's a bit wide-aimed, so here's a rule of thumb of whether your answered it: \"If these assumptions were raised, or the type of allowed generator somehow more restricted, would pickling generators work again?\"\n", "output": "There is lots of information about this available. For the \"official word\" on the issue, read the (closed) Python bugtracker issue.\nThe core reasoning, by one of the people who made the decision, is detailed on this blog:\n\nSince a generator is essentially a souped-up function, we would need to save its bytecode, which is not guarantee to be backward-compatible between Python\u00e2\u0080\u0099s versions, and its frame, which holds the state of the generator such as local variables, closures and the instruction pointer. And this latter is rather cumbersome to accomplish, since it basically requires to make the whole interpreter picklable. So, any support for pickling generators would require a large number of changes to CPython\u00e2\u0080\u0099s core.\nNow if an object unsupported by pickle (e.g., a file handle, a socket, a database connection, etc) occurs in the local variables of a generator, then that generator could not be pickled automatically, regardless of any pickle support for generators we might implement. So in that case, you would still need to provide custom __getstate__ and __setstate__ methods. This problem renders any pickling support for generators rather limited.\n\nAnd two suggested workarounds are mentioned:\n\nAnyway, if you need for a such feature, then look into Stackless Python which does all the above. And since Stackless\u00e2\u0080\u0099s interpreter is picklable, you also get process migration for free. This means you can interrupt a tasklet (the name for Stackless\u00e2\u0080\u0099s green threads), pickle it, send the pickle to a another machine, unpickle it, resume the tasklet, and voil\u00c3\u00a0 you\u00e2\u0080\u0099ve just migrated a process. This is freaking cool feature!\nBut in my humble opinion, the best solution to this problem to the rewrite the generators as simple iterators (i.e., one with a __next__ method). Iterators are easy and efficient space-wise to pickle because their state is explicit. You would still need to handle objects representing some external state explicitly however; you cannot get around this.\n\n"}
{"input": "Create (sane/safe) filename from any (unsafe) string I want to create a sane/safe filename (i.e. somewhat readable, no \"strange\" characters, etc.) from some random Unicode string (mich might contain just anything).\n(It doesn't matter for me wether the function is Cocoa, ObjC, Python, etc.)\n\nOf course, there might be infinite many characters which might be strange. Thus, it is not really a solution to have a blacklist and to add more and more to that list over the time.\nI could have a whitelist. However, I don't really know how to define it. [a-zA-Z0-9 .] is a start but I also want to accept unicode chars which can be displayed in a normal way.\n", "output": "Python:\n\"\".join([c for c in filename if c.isalpha() or c.isdigit() or c==' ']).rstrip()\n\nthis accepts Unicode characters but removes line breaks, etc.\nexample:\nfilename = u\"ad\\nbla'{-+\\)(\u00c3\u00a7?\"\n\ngives: adbla\u00c3\u00a7\nedit\nstr.isalnum() does alphanumeric on one step. \u00e2\u0080\u0093 comment from queueoverflow below. danodonovan hinted on keeping a dot included.\n    keepcharacters = (' ','.','_')\n    \"\".join(c for c in filename if c.isalnum() or c in keepcharacters).rstrip()\n\n"}
{"input": "python class instance variables and class variables im having a problem understanding how class / instance variables work in python. I dont understand why when i try this code the list variable seems to be a class variable\nclass testClass():\n    list = []\n    def __init__(self):\n        self.list.append('thing')\n\np = testClass()\nprint p.list\n\nf = testClass()\nprint f.list\n\noutput:\n['thing']\n['thing', 'thing']\n\nand when i do this it seems to be an instance variable\nclass testClass():\n    def __init__(self):\n        self.list = []\n        self.list.append('thing')\n\np = testClass()\nprint p.list\n\nf = testClass()\nprint f.list\n\noutput:\n['thing']\n['thing']\n\nmany thanks \njon\n", "output": "This is because of the way Python resolves names with the .. When you write self.list the Python runtime tries to resolve the list name first looking for it in the instance object, and if not found in the class instance.\nLet's look into it step by step\nself.list.append(1)\n\n\nIs there a list name into the object self?\n\nYes: Use it! Finish.\nNo: Go to 2.\n\nIs there a list name into the class instance of object self?\n\nYes: Use it! Finish\nNo: Error!\n\n\nBut when you bind a name things are different:\nself.list = []\n\n\nIs there a list name into the object self?\n\nYes: Overwrite it!\nNo: Bind it!\n\n\nSo, that is always an instance variable.\nYour first example creates a list into the class instance, as this is the active scope at the time (no self anywhere). But your second example creates a list explicitly in the scope of self.\nMore interesting would be the example:\nclass testClass():\n    list = ['foo']\n    def __init__(self):\n        self.list = []\n        self.list.append('thing')\n\nx = testClass()\nprint x.list\nprint testClass.list\ndel x.list\nprint x.list\n\nThat will print:\n['thing']\n['foo']\n['foo']\n\nThe moment you delete the instance name the class name is visible through the self reference.\n"}
{"input": "What does '@reify' do and when should it be used? I saw it in the Pyramid tutorial for UX design. I couldn't make out much what this decorator is all about.\nSample code where I saw its usage.\ndef __init__(self, request):\n    self.request = request\n    renderer = get_renderer(\"templates/global_layout.pt\")\n    self.global_template = renderer.implementation().macros['layout']\n\n@reify\ndef company_name(self):\n    return COMPANY\n\n@reify\ndef site_menu(self):\n    new_menu = SITE_MENU[:]\n    url = self.request.url\n    for menu in new_menu:\n        if menu['title'] == 'Home':\n            menu['current'] = url.endswith('/')\n        else:\n            menu['current'] = url.endswith(menu['href'])\n    return new_menu\n\n@view_config(renderer=\"templates/index.pt\")\ndef index_view(self):\n    return {\"page_title\": \"Home\"}\n\n@view_config(renderer=\"templates/about.pt\", name=\"about.html\")\ndef about_view(self):\n    return {\"page_title\": \"About\"}\n\n", "output": "From the source code documentation:\n\n\"\"\" Put the result of a method which uses this (non-data)\n      descriptor decorator in the instance dict after the first call,\n      effectively replacing the decorator with an instance variable.\"\"\"\n\nA description from from the fuzzy notepad blog sums it up nicely.\n\nIt acts like @property, except that the function is only ever called\n  once; after that, the value is cached as a regular attribute. This\n  gives you lazy attribute creation on objects that are meant to be\n  immutable.\n\nSo in the code you posted, site_menu can be accesses like a cached property.  \n"}
{"input": "Check if an object exists I need to check if Model.objects.filter(...) turned up anything, but do not need to insert anything. My code so far is:\nuser_pass = log_in(request.POST)  # form class\nif user_pass.is_valid():\n    cleaned_info = user_pass.cleaned_data\n    user_object = User.objects.filter(email = cleaned_info['username'])\n\n", "output": "I think the easiest from a logical and efficiency point of view is using the queryset's exists() function, documented here:\nhttps://docs.djangoproject.com/en/dev/ref/models/querysets/#django.db.models.query.QuerySet.exists\nSo in your example above I would simply write:\nif User.objects.filter(email = cleaned_info['username']).exists():\n    # at least one object satisfying query exists\nelse:\n    # no object satisfying query exists\n\n"}
{"input": "Is the PySide Slot Decorator Necessary? I've seen some example code for PySide slots that uses the @QtCore.Slot decorator, and some that does not.  Testing it myself, it doesn't seem to make a difference.  Is there a reason I should or should not use it?  For example, in the following code:\nimport sys\nfrom PySide import QtCore\n\n# the next line seems to make no difference\n@QtCore.Slot()\ndef a_slot(s):\n    print s\n\nclass SomeClass(QtCore.QObject):\n    happened = QtCore.Signal(str)\n    def __init__(self):\n        QtCore.QObject.__init__(self)\n    def do_signal(self):\n        self.happened.emit(\"Hi.\")\n\nsc = SomeClass()\nsc.happened.connect(a_slot)\nsc.do_signal()\n\nthe @QtCore.Slot decorator makes no difference; I can omit it, call @QtCore.Slot(str), or even @QtCore.Slot(int), and it still nicely says, \"Hi.\"\nThe same seems to be true for PyQt's pyqtSlot.\n", "output": "This link explains the following about the pyqtSlot decorator:\n\nAlthough PyQt4 allows any Python callable to be used as a slot when\n  connecting signals, it is sometimes necessary to explicitly mark a\n  Python method as being a Qt slot and to provide a C++ signature for\n  it. PyQt4 provides the pyqtSlot() function decorator to do this.\n\nand\n\nConnecting a signal to a decorated Python method also has the\n  advantage of reducing the amount of memory used and is slightly\n  faster.\n\nSince the pyqtSlot decorator can take additional argument such as name, it allows different Python methods to handle the different signatures of a signal.\nIf you don't use the slot decorator, the signal connection mechanism has to manually work out all the type conversions to map from the underlying C++ function signatures to the Python functions.  When the slot decorators are used, the type mapping can be explicit.\n"}