Today I cleaned up the database for Fiddle Salad and Python Fiddle. Both use the same Django back-end for code storage. While browsing tags, I noticed that both CamelCase and lowercase spellings were often used for the same tag. Since I was working on a tag suggestion feature earlier this week, I decided to convert all tags to lowercase so that tag suggestions would not be redundant. An additional benefit is further normalization of the data. Fortunately, I found a fork of django-taggit, the Django app I use for tagging, that supports enforcing lowercase tags everywhere. Two management commands were already present for normalizing data, mergetags and lowercasetags. django-taggit has two fields for each tag, a name and a slug. lowercasetags converts all tag names to their lowercase form. mergetags takes at least two tag slugs and merges the tags into a single destination tag, so that all associations are moved to that one tag. While mergetags is suitable for manually resolving redundant data, the number of tags on Fiddle Salad is too large for that. I wrote a command to automate this process:
from django.core.management.base import BaseCommand, CommandError
from django.core.exceptions import ObjectDoesNotExist
from taggit.models import Tag, TaggedItem


class Command(BaseCommand):
    help = 'merges all tags automatically'

    def merge(self, extra_slugs, dest_slug):
        try:
            dest_tag = Tag.objects.get(slug=dest_slug)
        except ObjectDoesNotExist:
            raise CommandError('Destination Tag "%s" does not exist' % dest_slug)
        for slug in extra_slugs:
            try:
                tag = Tag.objects.get(slug=slug)
            except ObjectDoesNotExist:
                raise CommandError('Tag "%s" does not exist' % slug)
            items = TaggedItem.objects.filter(tag=tag)
            count = items.count()
            for i, item in enumerate(items):
                # Report progress every 20 items.
                if i % 20 == 0:
                    self.stdout.write('Merging %s %d/%d\n' % (slug, i + 1, count))
                obj = item.content_object
                if not obj:
                    # The tagged object no longer exists. Note that this leaves
                    # merge() entirely: the tag is kept and no success line is printed.
                    return
                # Move the association from the duplicate tag to the destination tag.
                obj.tags.remove(tag)
                obj.tags.add(dest_tag)
            tag.delete()
        self.stdout.write('Successfully merged tags into "%s"\n' % dest_slug)

    def handle(self, *args, **options):
        for tag in Tag.objects.all():
            # After lowercasing, duplicates share a name but have different slugs.
            if Tag.objects.filter(name=tag.name).count() > 1:
                tags = Tag.objects.filter(name=tag.name).order_by('id')
                # The oldest tag (lowest id) becomes the destination.
                dest = tags[0].slug
                extras = []
                for tag in tags[1::]:
                    extras.append(tag.slug)
                self.merge(extras, dest)
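The grouping step in handle can be tried out without a database. The following is only a sketch on plain tuples, not part of the command: plan_merges and the sample data are made up for illustration. It groups tags by name and treats the lowest id in each group as the destination, as the command does.

```python
from collections import defaultdict


def plan_merges(tags):
    """Map each destination slug to the duplicate slugs that would be merged into it.

    `tags` is an iterable of (id, name, slug) tuples (a hypothetical stand-in
    for rows of the Tag table after lowercasing).
    """
    by_name = defaultdict(list)
    # Sorting the tuples orders them by id, so the oldest tag comes first.
    for tag_id, name, slug in sorted(tags):
        by_name[name].append(slug)
    return {slugs[0]: slugs[1:] for slugs in by_name.values() if len(slugs) > 1}


# Illustrative data only.
tags = [
    (1, "jquery", "jquery"),
    (7, "jquery", "jquery_1"),
    (3, "test", "test"),
    (9, "test", "test_1"),
    (4, "python", "python"),
]
plan = plan_merges(tags)
```

Here plan maps "jquery" to ["jquery_1"] and "test" to ["test_1"], while "python" has no duplicates and is left out.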
Because performance is not a concern for a one-time data processing script, I did not bother to optimize the queries or the run time. This script would be useful for anyone who wants to normalize tags in the same manner, so it is in a git repository. Finally, I tested the new command on a clone of the production database.
bash-4.1$ python manage.py lowercasetags
Lowercasing 1/1621
Lowercasing 21/1621
.
.
.
Lowercasing 1621/1621
bash-4.1$ python manage.py mergealltags
Merging jquery_1 1/46
Merging jquery_1 21/46
Merging jquery_1 41/46
Successfully merged tags into "jquery"
Successfully merged tags into "jquery"
Merging stylus_1 1/7
Successfully merged tags into "stylus"
Merging hello_1 1/10
Successfully merged tags into "hello"
Merging test_1 1/147
Merging test_1 21/147
Merging test_1 41/147
Merging test_1 61/147
Merging test_1 81/147
Merging test_1 101/147
Merging test_1 121/147
Merging test_1 141/147
Successfully merged tags into "test"
Merging me_1 1/2
Successfully merged tags into "me"
Merging no_1 1/5
Successfully merged tags into "no"
Merging one_1 1/16
Successfully merged tags into "one"
Merging things_1 1/3
Successfully merged tags into "things"
Merging learning_1 1/4
Successfully merged tags into "learning"
Successfully merged tags into "body"
Merging week-one_1 1/1
Successfully merged tags into "week-one"
Merging studio_1 1/33
Merging studio_1 21/33
Successfully merged tags into "studio"
Merging internet_1 1/36
Merging internet_1 21/36
Successfully merged tags into "internet"
Merging assignment_1 1/6
Successfully merged tags into "assignment"
Merging homework_1 1/6
Successfully merged tags into "homework"
Merging lessons_1 1/1
Successfully merged tags into "lessons"
Merging code_1 1/12
Merging tags_1 1/7
Successfully merged tags into "tags"
Merging two_1 1/6
Successfully merged tags into "two"
Merging salcedo_1 1/3
Successfully merged tags into "salcedo"
Merging page_1 1/15
Successfully merged tags into "page"
Merging music_1 1/4
Successfully merged tags into "music"
Merging table_1 1/7
Successfully merged tags into "table"
Merging band_1 1/9
Merging texas_1 1/1
Successfully merged tags into "texas"
Merging biography_1 1/2
Merging assignment-two_1 1/2
Successfully merged tags into "assignment-two"
Merging website_1 1/9
Merging a_1 1/2
Successfully merged tags into "a"
Merging words_1 1/1
Successfully merged tags into "words"
Merging section_1 1/2
Successfully merged tags into "section"
Merging header_1 1/1
Successfully merged tags into "header"
Merging ui_1 1/4
Successfully merged tags into "ui"
Merging first_1 1/8
Successfully merged tags into "first"
Merging random_1 1/1
Successfully merged tags into "random"
Merging internet-studio_1 1/5
Successfully merged tags into "internet-studio"
Merging angularjs_1 1/11
Successfully merged tags into "angularjs"
Merging i_1 1/2
Successfully merged tags into "i"
Merging lines_1 1/1
Successfully merged tags into "lines"
Merging row_1 1/1
Successfully merged tags into "row"
Merging alex-alpha_1 1/1
Successfully merged tags into "alex-alpha"
Merging assignment-one_1 1/2
Successfully merged tags into "assignment-one"
Merging google_1 1/1
Successfully merged tags into "google"
Merging man_1 1/4
Successfully merged tags into "man"
Merging nick_1 1/1
Successfully merged tags into "nick"
Merging cartoon_1 1/1
Successfully merged tags into "cartoon"
Merging batman_1 1/2
Successfully merged tags into "batman"
Merging code_1 1/7
Merging the_1 1/1
Successfully merged tags into "the"
Merging animation_1 1/2
Successfully merged tags into "animation"
Merging band_1 1/4
Merging assignment-one-of-three_1 1/2
Successfully merged tags into "assignment-one-of-three"
Merging status_1 1/1
Successfully merged tags into "status"
Merging python_1 1/2
Successfully merged tags into "python"
Merging cat_1 1/1
Successfully merged tags into "cat"
Merging none_1 1/7
Successfully merged tags into "none"
Merging adam_1 1/2
Successfully merged tags into "adam"
Merging school_1 1/3
Successfully merged tags into "school"
Merging website_1 1/9
Merging biography_1 1/2
Merging bootstrap_1 1/8
Successfully merged tags into "bootstrap"
Merging datamill_1 1/5
Successfully merged tags into "datamill"
Merging gentoo_1 1/2
Successfully merged tags into "gentoo"
Merging dobschal_1 1/1
Successfully merged tags into "dobschal"
Merging weimar_1 1/1
Successfully merged tags into "weimar"
Once everything went fine on the clone, I ran lowercasetags and mergealltags on both Fiddle Salad and Python Fiddle. I was really impressed with the results as I clicked through the tags on both sites. The tags on Fiddle Salad were much better organized, as they were ordered by popularity. While looking through the tags, I noticed that “test” was among the most popular. I decided to add “test” to the list of stopwords for django-taggit. These stopwords are removed during save so that they are not associated with new snippets.
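To make the combined effect of lowercasing and stopwords concrete, here is a small stand-alone sketch. It is not django-taggit's implementation: clean_tags and the STOPWORDS set are hypothetical names used only for illustration.

```python
# Assumed stopword list; the real one lives in the tagging app's settings.
STOPWORDS = {"test"}


def clean_tags(names, stopwords=STOPWORDS):
    """Lowercase tag names, drop stopwords, and remove duplicates, keeping order."""
    seen = set()
    cleaned = []
    for name in names:
        lowered = name.strip().lower()
        if not lowered or lowered in stopwords or lowered in seen:
            continue
        seen.add(lowered)
        cleaned.append(lowered)
    return cleaned


result = clean_tags(["jQuery", "jquery", "Test", "AngularJS"])
```

With this input, result contains only "jquery" and "angularjs": the CamelCase duplicate collapses into one tag and "Test" is dropped.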
Now that the tags are normalized, I am ready to move on and deploy tag suggestions.
I made this video in the summer, and Python Fiddle itself back in 2010. It is still the best Python IDE on the web running in a browser. So here is a demo if you’re new to Python:
The repository of Python code snippets on Python Fiddle now has 742 snippets posted and counting. Among them, there are several for listing prime numbers. The naive approach by factoring definitely isn’t efficient:
Another approach also contains the hidden double for loop:
Way to go with this one, packing a little program into the regular expression:
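The posted snippets are not shown here. As a rough illustration of the three approaches mentioned above, and not the snippets themselves, the sketches below list primes by explicit trial division, by a comprehension that hides the inner loop inside all(), and by the well-known unary regular expression trick. All function names are made up for this example.

```python
import re


def primes_trial_division(limit):
    """Naive approach: try every possible divisor of every candidate."""
    primes = []
    for n in range(2, limit + 1):
        is_prime = True
        for d in range(2, n):
            if n % d == 0:
                is_prime = False
                break
        if is_prime:
            primes.append(n)
    return primes


def primes_comprehension(limit):
    """Looks like one loop, but all() hides a second loop over the divisors."""
    return [n for n in range(2, limit + 1) if all(n % d for d in range(2, n))]


# Matches a string of n ones when n is 0, 1, or composite: the group captures
# a candidate divisor (two or more ones) and the backreference repeats it.
_NOT_PRIME = re.compile(r'^1?$|^(11+?)\1+$')


def primes_regex(limit):
    """The regular expression engine does the trial division by backtracking."""
    return [n for n in range(2, limit + 1) if not _NOT_PRIME.match('1' * n)]
```

All three do essentially the same amount of work; the last two only hide where the inner loop happens.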
The work done on Fiddle Salad this month would not have been possible without last month’s planning. Furthermore, Fiddle Salad would not have been my idea if I had not invested time in building Python Fiddle. Python Fiddle was really the end product of 9 years of dreaming about running a high-performance computer, and the result of my experience using Gentoo Linux. So I bought a computer to build Python Fiddle, which also turned out to be necessary to run the latest IDE and development tools to build Fiddle Salad. When I started working with the Python interpreter in JavaScript, it was horrendously slow. It took about 20 seconds to load and used almost 1GB of memory. No text editor other than Vim, with syntax highlighting turned off, was quick enough to edit the 12MB source code file.
Fiddle Salad is an evolution of both the original idea and code base that belonged to Python Fiddle. Now it is really Fiddle Salad that’s driving the development of Python Fiddle, because they share much of the code base.
So this is the third major milestone, which I almost gave up on before I embarked on it. Before I started work on this milestone, actually a day or two before I planned, I suddenly noticed huge, discouraging signs. They came as shocking surprises. For example, I discovered a hidden option in an application I have used often before that had some of the functionality I was going to build. If that wasn’t enough, it was actually quite popular and many people probably knew that feature. As another example, I discovered another application that was more innovative in certain aspects than the application I planned to build. I got still more examples, but they aren’t worth repeating here.
As a habit, I reached for my next plan and the best tools I have available. I then realized that I would be throwing away about 8 months of work and the plans for this month, which worked out so well. Although I had no reason and no incentive at all to work on Fiddle Salad, I did so only because I enjoyed every moment of it. I believe that’s what we are all here for, the very drumbeat of the universe.
In the end, those serious signs got swallowed up by my project, as I managed to either include their ideas or integrate them right into it. Fiddle Salad is really the culmination and peak of all live web development environments, having the best features in all of them and in my imagination.
While wrapping up the Fiddle Salad project and doing cross-browser testing, I found that Firefox wouldn’t run my project, at least on localhost.
This was one of the reasons I couldn’t get a stable release all in one shot. I initially thought it was caused by the IP address or the port number, but others report it’s just a problem with not having a domain. So one way is to add a domain such as myapp.com to the Windows hosts file in C:\Windows\System32\drivers\etc.
# Copyright (c) 1993-2009 Microsoft Corp.
#
# This is a sample HOSTS file used by Microsoft TCP/IP for Windows.
#
# This file contains the mappings of IP addresses to host names. Each
# entry should be kept on an individual line. The IP address should
# be placed in the first column followed by the corresponding host name.
# The IP address and the host name should be separated by at least one
# space.
#
# Additionally, comments (such as these) may be inserted on individual
# lines or following the machine name denoted by a '#' symbol.
#
# For example:
#
# 102.54.94.97 rhino.acme.com # source server
# 38.25.63.10 x.acme.com # x client host
# localhost name resolution is handled within DNS itself.
127.0.0.1 localhost
127.0.0.1 myapp.com
The changes are applied immediately after saving, and I’m able to run the site locally in Firefox.
Another way is to use one of the Worker polyfills that bypass the Firefox security checks. I suggest fakeworker, but in my case I would need to rewrite some code to be compatible with the old API to use it. Of course, this was only one of the many problems I found on the uploaded version, so I had to prioritize which problems to fix first. It’s always better to go for the efficiency gains at the start and visible results at the end. So I made a Django media sync JavaScript debug processor for future debugging on the production site.
Creating clean, readable code is a primary imperative when working with large systems. Today I refactored the Knockout view model from a single object shared by both projects into an inheritance-based hierarchy. Now, in the middle of it, I’m replacing a programmingLanguage variable with a Language class that also handles the language for the style sheet (LESS, SCSS) and the document (Jade, ECO).
var Language = Class.$extend({
    __init__: function() {
        LANGUAGE = {
            PYTHON: 0,
            JAVASCRIPT: 1,
            LESS: 2,
            CSS: 3,
            HTML: 4
        };
        LANGUAGE_TYPE = {
            STYLE: 1,
        }
    }
})
Then I paused and thought it didn’t seem right. I remembered that enums were more terse in C. So I decided to use an Enum helper.
function Enum() {
    var obj = new Object();
    for (var i = 0; i < arguments.length; i++) {
        obj[arguments[i]] = i;
    }
    return obj;
}

var Language = Class.$extend({
    __init__: function() {
        LANGUAGE = Enum('PYTHON', 'JAVASCRIPT', 'LESS', 'CSS', 'HTML');
        LANGUAGE_TYPE = Enum('SCRIPT', 'STYLE', 'DOCUMENT');
    }
})
Now I like that much better. It’s like automatic numbering, except I don’t have to read it.
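For readers more at home in Python, the same automatic numbering can be sketched in a few lines. This is only an illustration of what the Enum helper above returns, with make_enum as a made-up name; it is not code from either project.

```python
def make_enum(*names):
    """Map each name to its position, like the JavaScript Enum helper."""
    return {name: index for index, name in enumerate(names)}


LANGUAGE = make_enum('PYTHON', 'JAVASCRIPT', 'LESS', 'CSS', 'HTML')
LANGUAGE_TYPE = make_enum('SCRIPT', 'STYLE', 'DOCUMENT')
```

The numbers come out the same as in the hand-written version, so LANGUAGE['HTML'] is 4 and LANGUAGE_TYPE['STYLE'] is 1.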
I just had an aha! insight 5 minutes ago when working on the design to be implemented next month. The previous storage format used string splitting and joining. Everyone who’s worked with it recognizes the mistake immediately, though the solution didn’t come until I spent a bit of time designing the system.
Previously, get_code and set_code belonged to the abstract factory. Looking at the way things are used, a separate code storage class may better facilitate the implementation of the local history feature. I took a snapshot when I got the idea for the JSON format, so things are still a bit disorganized:
The realization is that this design doesn’t require the back-end to store all the different languages used in the fiddle. It allows both PythonFiddle, which stores only Python, and FiddleSalad, which mixes languages, to use the same storage backend. They already do, but I’m getting ready to go to the next stage. The /python/ and /coffeescript/ URL structure is still good for SEO and linking purposes.
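A minimal sketch of what such a storage class might look like follows. Everything in it is an assumption made for illustration: the CodeStorage name, the dumps method, and the use of language names as JSON keys are not taken from the actual design, and only get_code and set_code are names from the text above.

```python
import json


class CodeStorage:
    """Hypothetical wrapper that keeps one JSON document per fiddle, keyed by language."""

    def __init__(self, raw=None):
        # `raw` is a JSON string previously produced by dumps(), or None.
        self._code = json.loads(raw) if raw else {}

    def set_code(self, language, source):
        self._code[language] = source

    def get_code(self, language):
        return self._code.get(language, "")

    def dumps(self):
        """Serialize to the string the back-end would store."""
        return json.dumps(self._code, sort_keys=True)


# A Python-only fiddle and a mixed-language fiddle use the same format.
python_only = CodeStorage()
python_only.set_code("python", "print('hi')")

mixed = CodeStorage()
mixed.set_code("coffeescript", "alert 'a|b'")
mixed.set_code("less", "@c: #fff;")
restored = CodeStorage(mixed.dumps())
```

Because JSON escapes the content itself, source code that happens to contain a separator character survives a round trip, which is where a split-and-join format tends to break.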
The initial release of PythonFiddle attracted a lot of attention due to an article on Slashdot, one of the best places to post general technology news. Although the first version featured cutting-edge technology that would run a Python interpreter in the browser, the general consensus was that it’s good for sharing Python code on the web, but not much else. With some afterthought (or maybe forethought, because this was the original intention), I made a new version for web development.
The new PythonFiddle aims to solve problems with JavaScript by offering Python as a replacement. Developers prefer class based inheritance to JavaScript’s prototypal inheritance, mostly because it’s mainstream. Writing applications with classes built into the language is helpful in large projects, along with the removal of global scope. For small projects, Python’s pseudo-code like syntax is preferable to the ancient C syntax.
The large collection of third-party tools in JavaScript is not overlooked, as it is in the case of Google’s Dart programming language. JavaScript libraries such as jQuery can be used directly; others can be added as external resources. With PythonFiddle, web developers who use Python server-side are now able to use the same language client-side. Besides the Python to JavaScript compiler, other advantages PythonFiddle offers include live reloading of the page, Less, and Zen Coding.
A recent site I launched received a lot of attention. It may have been the only way to get the momentum going, since Google wouldn’t index a site with no content.
Fortunately, I spent a week before launch optimizing the page loading, serving static files from Amazon, and fixing usability bugs. The result was a very smooth launch, even when serving many visitors per second. Some users who experienced slow loading may have been waiting for the browser to download a 1.3 or 2.0 MB file, which could have caused a traffic jam on a static file server. The technique used here was to serve files that are already compressed, with lzma and gzip respectively. Because .htaccess configuration is not available on Amazon, I served these from another server.
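Compressing ahead of time can be sketched with Python's standard library. This is only an illustration of the idea, not the build step actually used for the site; the payload is a stand-in for a large JavaScript file, and how the server picks the right variant is left out.

```python
import gzip
import lzma

# Stand-in for a large, repetitive static file (the real one is not shown).
payload = b"var x = 1;\n" * 10000

# Compress once at build time so the server never compresses per request.
gz_bytes = gzip.compress(payload)
xz_bytes = lzma.compress(payload)

sizes = {
    "original": len(payload),
    "gzip": len(gz_bytes),
    "lzma": len(xz_bytes),
}
```

The compressed bytes would then be written next to the original file and sent with the matching Content-Encoding header, which is the part that needs server configuration.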
The most surprising effect was that Google seemed to have picked up the link as soon as it appeared, along with other sites that mirror content.