Python Web开发中常用的第三方库

Thursday, November 28, 2013

TL;DR

经常有朋友问: 如果用Python来做Web开发该选用什么框架? 用Pyramid开发Web该选用怎样的组合? 在这里我将介绍一些Python Web开发中常用的第三方库, 基本适用于Django以外的Web框架(Pyramid、Flask、Tornado、Web.py、Bottle等).

ORM

  • SQLAlchemy, 在ORM方面,首选SQLAlchemy,没有之一!
    支持SQLite, PostgreSQL, MySQL, Oracle, MS-SQL, Firebird, Sybase等主流关系数据库系统
    支持的Python环境有Python2、Python3,PyPy以及Jython。
    主要的特性请移步 Key Features of SQLAlchemy
    推荐和数据库迁移工具Alembic搭配使用

  • MongoEngine, 如果你用MongoDB,推荐MongoEngine.
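
以SQLAlchemy为例, 下面是一个声明式模型的最小示意(假设已安装sqlalchemy 1.4+; 其中User模型与字段都是为演示虚构的):

```python
# SQLAlchemy 声明式模型 + 基本增查的最小示意(User 模型为演示虚构)
from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    name = Column(String(50))

# 用内存 SQLite 便于演示; 换成 PostgreSQL/MySQL 只需改连接串
engine = create_engine('sqlite://')
Base.metadata.create_all(engine)

Session = sessionmaker(bind=engine)
session = Session()
session.add(User(name='eric'))
session.commit()

count = session.query(User).filter_by(name='eric').count()
```

真实项目中连接串、Session的生命周期一般放到配置和框架集成层(比如Pyramid的tm/zope.sqlalchemy)里管理。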

Template Engine

在模板引擎方面选择也比较多, 有Chameleon、Jinja2、Mako等可供选择。我用过Chameleon和Jinja2, 性能都非常好.
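
以Jinja2为例, 渲染模板的最小示意(假设已安装jinja2):

```python
# Jinja2 模板渲染的最小示意
from jinja2 import Template

tmpl = Template('Hello {{ name }}! 共有 {{ items|length }} 项')
html = tmpl.render(name='World', items=[1, 2, 3])
```

实际项目里一般用Environment + FileSystemLoader从模板目录加载, 或直接用各Web框架提供的集成(如pyramid_jinja2)。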

Form Engine

Cache Engine & Session Store

  • Beaker 缓存和Session管理首选Beaker, 没有之一! 可以搭配文件、dbm、memcached、内存、数据库、NoSQL等作为存储后端. 如果你用Pyramid作为Web框架,那么可以直接使用pyramid_beaker.

Others

环境构建

任务队列

  • Celery (芹菜)一个分布式异步任务队列, 很强大!
  • RQ 这是一个轻量级的任务队列,基于Redis, 可以尝试一下。

WebServer

工具

  • Fabric, 可以通过它完成自动化部署和常规的运维等工作, 可以参考PPT《Fabric-让部署变得简单》。
  • Supervisor 一个强大的进程管理工具, 用来管理各种服务(比如Gunicorn、Celery等), 服务挂掉时 Supervisor 会自动重启服务。
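
Supervisor通过ini格式的配置来管理进程, 下面是一个管理Gunicorn进程的配置片段示意(程序名与路径均为虚构):

```ini
[program:myapp]
command=/srv/myapp/bin/gunicorn -w 4 myapp.wsgi
directory=/srv/myapp
autostart=true
autorestart=true          ; 进程挂掉时自动重启
stdout_logfile=/var/log/myapp/gunicorn.log
```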

导出报表数据

  • Tablib, 这个挺好用, 支持导出Excel, JSON, YAML, HTML, TSV, CSV格式数据. 我还创建了一个Pyramid插件 pyramid_tablib, 可以集成到Pyramid项目中使用
  • 导出PDF有reportlab和PyPDF2

第三方身份验证

  • velruse, 支持各大网站的身份验证. 国内部分我已经加入了Weibo、Douban、QQ、Taobao、Renren, 并merge到主版本库中, 欢迎使用!

Helper

To Be Continued...

升级PostgreSQL 9.2 -> 9.3

Thursday, November 14, 2013

PostgreSQL发布9.3了, 用 brew upgrade postgresql 升级到9.3后竟然启动不起来, 查看日志发现原来9.2的数据格式不兼容, 需要迁移一下数据, 碰到这个问题的同学可以看一下 :-)

错误日志, 数据不兼容

/usr/local(master ) tail -f /usr/local/var/postgres/server.log
FATAL:  database files are incompatible with server
DETAIL:  The data directory was initialized by PostgreSQL version 9.2, which is not compatible with this version 9.3.1.

解决办法

PostgreSQL提供了一个升级迁移脚本 pg_upgrade, 用来迁移数据

pg_upgrade -b oldbindir -B newbindir -d olddatadir -D newdatadir [option...]

1. 新建一个PostgreSQL9.3的数据目录

/usr/local/var(master ) mv postgres postgres9.2
/usr/local/var(master ) initdb /usr/local/var/postgres -E utf8

2. 迁移数据到新目录中

/usr/local/var(master ) pg_upgrade \
-b /usr/local/Cellar/postgresql/9.2.4/bin/ \
-B /usr/local/Cellar/postgresql/9.3.1/bin/ \
-d /usr/local/var/postgres9.2 \
-D /usr/local/var/postgres \
-v

最后迁移完成打印下面的信息就代表迁移成功了

...
Creating script to analyze new cluster                      ok
Creating script to delete old cluster                       ok

Upgrade Complete
----------------
Optimizer statistics are not transferred by pg_upgrade so,
once you start the new server, consider running:
    analyze_new_cluster.sh

Running this script will delete the old cluster's data files:
    delete_old_cluster.sh

3. 启动PostgreSQL9.3

查看版本和数据

/usr/local/var(master ) run_postgresql
server starting
/usr/local/var(master ) psql postgres
psql (9.3.1)
Type "help" for help.

postgres=# \l

4. 删除老版本和数据

删除数据和刚刚执行pg_upgrade产生的两个脚本

/usr/local/var(master ) rm -rf analyze_new_cluster.sh delete_old_cluster.sh postgres9.2

卸载PostgreSQL9.2.4

brew cleanup postgresql

搞定!

QCon上海2013大会流水账

Monday, November 4, 2013


Day 1


上午是来自Twitter、LinkedIn、Github等大公司的四场英文主题演讲, 演讲内容比较泛。英文不好也没太听明白, 借了个同声传译的耳机, 翻译质量也很一般, 很多术语翻得不准, 听得费劲, 后来就听原声了。

  • 第一场是来自Twitter的Raffi做的《Twitter面向服务的架构之路》, 介绍了Twitter这样一个高速变革、高速发展的系统为维持高并发而采取的一系列解决方案, 以及管理系统复杂性所采取的一些设计理念. 其中讲到他们的RPC框架Finagle, 有高并发需求的同学可以研究一下, 他们的Timeline cache也是用Redis在做。

  • 第二个主题演讲是来自Linkedin的数据产品化。作为一个全球最大的职业类SNS, 他们介绍了如何通过数据进行产品化的思路, 并展示了一些相关算法模型。数据沉淀到一定规模后其实都应该考虑数据产品化, 推荐系统是提高转化率的一个好方式。

  • 第三个主题演讲是来自Github的分享: 干掉产品经理, 大多数公司都会设置一个产品主管或者一堆产品经理来决定产品要有哪些功能特性, 但是,有一些企业正在抛开产品经理, 完全让开发者来决定应该实现哪些功能.
    当然Github的团队水平相当高, 产品也特殊, 这样一个产品工程师每天都要用, 所以Microsoft的"Eat Your Own Dogfood"很重要.

  • 上午最后一场讲的是机器的同理心(Mechanical Sympathy), 大概是通过赛车行业的例子来讲软件开发中的一些理念, 不明觉厉哪

下午是专题演讲

第一场在《知名网站案例分析》专题会场听的是阿里外贸团队分享跨境网站中遇到的一些SEO及CDN问题和解决方案。在SEO与性能优化方面, 他们通过Google Ajax异步兼容的方式来对系统进行优化: 在页面中加入一个meta标记<meta name="fragment" content="!">, 爬虫发现页面含有这个标记会把URL变成http://xxxx?_escaped_fragment_=, 程序根据_escaped_fragment_标记返回给爬虫快照。这个办法会形成两次请求, 他们表示对现有10%的爬虫占比可以接受。其中提到通过User-Agent来判断是否为爬虫是不符合Google规范的, 存在降权风险等。对于地区差异大的网站, 图片占大部分的下载资源, 所以CDN的架构相当关键; 全部图片同步会产生带宽成本大的问题, 他们采用了只同步主要图片(商品第一屏图片)的变通方式来提高用户体验。

后来跑去听《团队文化》专题了,这个专题主要是讲工程师文化以及技术管理中的一些探讨。有来自Github,游戏公司、豆瓣以及座谈讨论会。

豆瓣通过code平台的故事分享了豆瓣的工程师文化: 工程师自发地创建了code这样一个项目, 慢慢发展起来, 成为了豆瓣工程师每天依赖的工作平台。里面讲到一个有趣的事情: 这个项目并没有产品负责人, 在一年的时间里也没有全职的工程师投入, 大多数需求被提出来后, 几天内就会有工程师主动将其实现。如果安排一个全职的负责人来负责这个code项目, 负责人可能为了刷存在感, 总会开发些不实用的功能, 那么这个项目也许就发展不下去了, 哈哈哈, 干掉产品经理!!! :-P

在座谈会中有一些不错的观点

  • 创始人的文化就是公司的文化
  • 大牛带小牛是最高效的成长方式
  • 小团队更适合杠杆率高的行业

Day 2


上午我主要是在《推荐系统》专题会场听推荐系统相关的分享, 迟到了, 第一场只听了后半部分。第一场是来自百度的, 也是这个专题里我觉得讲得比较好的。
主要介绍百度在推荐系统上的实践,有相关推荐、个性化推荐、tag浏览等,通过用户建模、item建模、关联、个性化推荐等策略。根据应用需求和数据特点不断调整算法策略。

下午在《移动应用案例分析》专题会场听了豌豆荚在Android方面的技术研究, 以及搜狐新闻客户端后端架构的演进和Push系统, 讲他们随着发展步伐如何做技术选型、技术架构等。后面没有特别感兴趣的主题, 加上又特别困, 就回家了

Day 3


上午在《扩展性、可用性和高性能》专题会场听了篱笆网的技术演进、唯品会如何在很短时间内实现支持5倍流量的系统以及新浪微博分享的基于单元化架构的高性能服务实践。

  • 篱笆网主要是分享了他们如何解决数据访问层的性能优化和架构选型, 同时也成就了Cassandra这样一个NoSQL产品在国内互联网界的成功案例
  • 唯品会分享了他们在做大促销前的准备工作, 面对存在大量历史问题的系统是如何做到支持5倍流量的
  • 上午的最后一个演讲是来自新浪微博关于单元化架构的实践,通过单元化架构并行计算、数据本地化等方式来提高性能。

下午第一场是一个老外讲企业创新,这哥们后面还做了一个可穿戴计算的生态圈的介绍,中间有演示Google Glass,Facebook前端工程师Hedger Wang介绍碎片化终端整合的思考,下午场最喜欢这个演讲了,介绍了Web App和Native app的一些选择,如何更好的跨终端设计,以及Web App在跨终端的一些解决方案。 其中讲到到底是Web还是Native,Web的优势是广度的,当用户越来越多的时间花在你的app上的时候,我们应该把他带到Native上。 我觉得Web和Native都要有,呵呵,在资源不够的时候应该先Web再Native。 后面有讲到应该用Web Components的方式来解决跨终端web问题,而不是每个终端做一个相同功能的产品,通过Web Components方式来渲染适合各种终端的展现,这个不错有空要研究下。

后面几个是跨界演讲, 应该算Lightning Talk。鬼脚七分享了他如何做自媒体, 蔡学镛分享了他的成长经历, 以及Roy Li分享黑客的自我修养, 这几个Lightning Talk听起来要轻松些。

总结


本次大会的内容主要集中在大公司的大架构分享、云计算和高并发等, 缺少Startup相关的分享。三天的大会时间有点长, 整个听下来比较累, 不过还是有不少收获, 见到了好多老朋友, 也认识了一些新朋友。

【转】与成功学大师对话

Friday, March 15, 2013

原文链接: http://book.douban.com/review/2043761/

我已经三十出头了。我虽然赚得了人生的第一桶金,也颇受业内人士的称赞,可胡润那个排行榜上还看不到我的名字。即使比起榜上最后一名,我的资产总额还差人家一个数量级。

我读过《高效能人士的七种习惯》,知道按部就班地接近目标;我也读过《细节决定成败》,懂得谨小慎微地苦心经营;我还读过《第五项修炼》,努力把自己的团队建设成“学习型组织”。可是,我总觉得自己还没有参透生意经,于是四处寻访高人指点。经朋友介绍,我到香港拜会过南怀瑾,也到泰国求教过白龙王。这二位爷给我讲了一番大道理,听后我却觉得不知所云。

土菩萨拜过了,只好求洋神仙。我的师弟小田在美国留学,一次他提到该国的成功学大师科鲁奇。据说此公著作等身,妙语连珠,风靡全美。“通用”以前的老总韦尔奇是他的好友,投资家巴菲特也是他府上的常客。几经反复,小田终于帮我联络上了科鲁奇大师。他答应给我提供一小时咨询,要价一万美金。一万就一万吧,只要能借我一双慧眼。

会面安排在加州某处度假胜地,原来大师在这里置了一套别墅。只见窗外阳光棕榈,沙滩美女,室内则书册满架,茶香扑鼻,另是一番光景。大师须发灰白,却红光满面,精神矍铄。我呈上一万美金的支票,另加一套国产高档瓷器,权作见面礼。大师点头收下,张口就杀我个下马威:“你看上去还很年轻,何必急着发财?”   我心下不快,却不动声色,说了句漂亮话:“弟子知晓富贵如浮云过眼,乃身外之物。只是若不及早成就一番事功,此生虚度就太可惜了。恳请大师不吝赐教。”

“好。”大师颔首微笑,“你们中国历史悠久,中国人也喜欢听故事。我就给你讲四个段子如何?每个段子揭示一件成功法宝。你听后自然有所领悟。”

冰球选手的生日

大师呷了口茶,接着问道:“你有没有看过冰球比赛?”

“没有,只在电视上见过。”

“我小时候在加拿大长大,这项运动在那里很盛行。有一年,我和新婚妻子去看两支劲旅的决赛。因为离比赛开始还有一段时间,我们就拿了份介绍选手背景的资料。我只粗粗扫了几眼,妻子却有所发现——‘你看,绝大部分冰球手出生在冬天!’

“我觉得她大惊小怪,便仔细查看了球手的生日。不错,大部分球手的生日都集中在一至三月,尤其以一月为多。我正感到奇怪,妻子已经根据女人的直觉提出了解释。她说冬天出生的人属于魔羯座或水瓶座,前者踏实稳重,富有毅力;后者灵动活泼,勇于创新。这两种人在冰球这种团队比赛中最容易胜出。他们分别担当后卫和前锋,就能组成一支强大的球队。你觉得这种解释合理吗?”

我简单答道:“弟子不相信星座之说,不过,我也常听人讲起性格如何决定命运。”

大师道:“起初我也觉得冬天出生的人具有某种特殊秉赋。不过,我很快联想起自己早年入学的日期要求。加拿大政府规定,新年那一天满七周岁的孩子才可以入读小学。你如果出生在一月二号,即使离七周岁就差一天,也必须等到来年才能入学,而你的同学则有可能比你小十二个月。你一定猜到了,因为几个月的年龄差距,冬天出生的孩子就比班上大部分同学长得高出一截。”

“不错,可这又说明什么?”

“加拿大的教练一般从十岁左右的孩子中挑选少年选手,再加以培养。那些冬天出生的孩子因为入学晚,比别人高出一截,体格也强壮一些,就容易被选入冰球培训队,尤其是那些最好的球队。他们因而有了一流的教练作指导,也有一流的球员互相切磋。他们十分清楚自己的职业前景,因此付出的努力也是普通少年球手的两三倍。等到十四五岁,这些“冬生冰球手”的水平已经明显超过那些夏秋两季出生的孩子,他们也就有机会晋升到省级和国家级的顶尖球队中。你看,起初一点优势,就会被命运逐渐放大。

“所以,我说的第一件法宝不是性格,而是势差。”

披头士与脱衣舞

大师书房里摆放着一套视听设备,他凝神望着其中一台硕大的音箱,突然问我:“你们这代中国青年大都听过摇滚乐,你是否记得这种流行音乐的始祖?”

“我想是披头士吧,不过我不怎么听他们的歌曲。”

“没错。我年轻的时候,披头士红极一时。我当时在加州作嬉皮,头发留得很长,整天抱着吉它,背着录音机,和朋友们开派对。”说到这里,大师不禁莞尔。“我虽是披头士的粉丝,内心也有几分妒嫉。我自忖音乐细胞不少于约翰-列侬,也曾和其他伙伴共同组建过乐队,可我们吸引的姑娘远远少于披头士招来的女观众。披头士为什么能成名?我找来该乐队成员的传记,谜底终于被我发现了。

“披头士没出道以前,只是一帮爱弄摇滚的英国混混。一天,幸运女神光顾了他们。一个利物浦商人邀请他们去德国汉堡演出——别以为是什么正儿八经的场合,不过为脱衣舞娘伴奏罢了。有个叫布诺的汉堡人拥有一家脱衣舞酒吧,当时德国还没有摇滚乐队,他去伦敦雇佣演奏者,碰巧遇到那个利物浦商人。事情就这样发生了。披头士在汉堡赚了点小钱,搞了搞女人,大概跟今天北京三里屯酒吧里的乐手也没什么区别。

“大师居然知道三里屯?”

“呵呵,我两年前去北京讲学。一天晚上,美国大使馆一位老朋友在那儿请我喝了点啤酒。接着说披头士。你知道,脱衣舞酒吧里的顾客音乐品位都不高,老板给的报酬也不多。可有一样,这种酒吧在德国的营业时间很长,披头士必须没日没夜地弹唱。日积月累,来美国发展之前,他们已经拥有了七年的演出经验,演奏技巧也达到了相当高超的境界。再加上列侬的创造天份,四人一炮打响也就不难理解了。”

“那您当年为何没有去脱衣酒吧应聘?”话一出口,我即知失言,不由吐了一下舌头。

大师却不介意:“我们嬉皮在性爱方面很开放,但不屑于出入这类声色场所。我和伙伴们试过几家唱片公司,全都碰了钉子,败兴而归。如果当年能拥有一个小小的演出场所,或许我也能在娱乐界扬名立万呢。

“所以,我说的第二件法宝不是技巧,而是舞台。”

盖茨生逢其时

讲到这里,大师悠悠叹了口气:“我最大的遗憾还不是没有成为摇滚乐明星,而是错过了当比尔-盖茨这类人物的机会。”

我一惊,差点把手中的茶碗掉到地上。

“盖茨的故事不用我重复了。不过,你有没有注意过他是哪年出生的?”

“如果弟子没有记错,应该是1955年10月28日。”盖茨的传记我买了不只一本,曾经反复研读。

“不错,他和我一样,是金秋十月出生的。不过不像冰球手,这其中倒没有奥秘。”大师谈兴正浓,接着讲道:“你若是读过点技术史,就知道1975年1月是硅谷最重要的时刻。正是那时,8800型个人电脑诞生了,成为当月《大众电子》杂志(Popular Electronics)的封面故事。不少人都看出电脑市场蕴含的巨大商机,可谁会先下海呢?如果你年纪够大,很可能已经在IBM之类的老牌公司谋到职位,再去自己创业机会成本太大;如果你年纪太小,恐怕还没有掌握必要的IT技能。因此,你的年纪必须恰到好处,才能显出英雄本色。盖茨当时正好从哈佛肄业,不过二十出头。他既懂编写软件,又是初生牛犊,于是抓住了黄金商机。

“同样靠微软挤进福布斯排行榜的富人们,只比盖茨大一点或小一点。保罗-艾伦生于1953年,他在高中电脑房里就认识盖茨;斯蒂夫-鲍尔默生于1956年,得以在哈佛结识盖茨。我再举两个例子:苹果电脑的创始人乔布斯生于1955年,谷歌现任CEO施密特也生于这一幸运年份。

“请问大师的生辰?”我不禁好奇。

“哈哈,我生于1950年,不过我也不是没有机会参加IT界革命。我过了几年流浪的嬉皮生活,于1972年申请就读瑞德大学(Reed College)的电子工程专业。那一年乔布斯也搬了进来,我们在校园里还打过几次照面。可他比我有决断,一年后就辍学了,后来跑到硅谷闯出了一片天地。本科毕业后,我又鬼使神差地对社会学发生了兴趣,在东岸一所常青藤大学读了个博士。如此一来,个人电脑时代的所有商机都被我错过了。”

“这么说,盖茨并非绝顶聪明?”

“盖茨当然是个聪明人,不过他的智慧也并非高得不可想像。我成名之后,他找我做过一两次咨询。那时他已然是世界首富,在事业方面我就没有再提建议。我和他聊起美国企业家的发家史,他反应很快,我提个大概,他就猜出每位富豪的经营诀窍。可他的机敏程度比我的好友德鲁克还是慢了半拍。

“所以,我说的第三件法宝不是智商,而是时运。”

塞翁失马的弗洛姆

讲到这里,一个小时就快到了。大师抬头看了看墙上的挂钟,告诉我不要着急:“我会把第四个段子讲完,否则你要跟我打官司了。这最后一个故事嘛,也和律师有关系。

“我从东部那所大学毕业后,想先到社会上打拼一番,就去曼哈顿上城一家律师事务所作了实习生。那家律所的合伙人中,有一位身材五短,背还有点驼,名叫弗洛姆。你别看他其貌不扬,以前还是哈佛法学院的高材生呢。不过,弗洛姆五十年代毕业后,因为他的犹太背景和主流社会不太搭调,有段时间找不到工作。最后,他加入了一家新成立的律所。这家公司是如此不起眼,以致于什么上门的活儿都接。可是,它成长迅速,如今已经拥有两千名律师,年收入高达十亿美元。

“我作实习生的时候,有一次吃饭碰到老板弗洛姆,就向他请教发家秘诀。他告诉我你们中国一句古话——塞翁失马,焉知非福。五十年代的华尔街还有点贵族风度,有点名气的律所都不愿意接“敌意收购”(hostile takeover)这类脏活。如果实在不好退却,他们就把脏活转包给弗洛姆的公司。转眼到了七十年代,金融管制放松了,信贷资金充裕了,投资者也变得气势汹汹了。这一切都推动了企业收购大潮。现在所有的律所都愿意接并购案了,不过你可以想到,只有弗洛姆的公司做得最为出色——因为他们已经积累了近二十年的从业经验。

“我在八十年代初见到弗洛姆时,他的生意如此火爆,连我这种非科班出身的学生都被招了进来。那餐饭快吃完的时候,他对我讲‘行行出状元。你一旦干这行有了名声,人们就会首先想到你。’”

我问道:“那么,大师为什么后来又离开了那家律所?”

“通过弗洛姆,我终于明白,我再也不能跟在别人后面打工了。我必须做点别人没有做过的事情,于是决定钻研杰出人物成功的秘密。三十年后,你不是也找到我了吗?现在,轮到你这个中国人尝试些新领域了。

“所以,我说的第四种要素不是能力,而是先机。”

大师讲完这四个段子,起身走向窗前,若有所思地吟出一句唐诗:“只在此山中,云深不知处。”他转过头对我说:“我年轻时一直以为,那些大人物是靠他们的真才实学成功的;现在才明白,真才实学也要靠外在的机缘才能造就。你总以为自己修炼不够,其实还未将大局看透。”

我就此拜别了科鲁奇大师。出门后,师弟小田问我,这一万美金花得值不值。我平日里精打细算,这时却冒出一句偈语:“运用之妙,存乎一心”。

[后记] 《非同凡响》(outliers,中文版译为“异类”,殊觉不妥)是去年年底在美国出版的一部非虚构类畅销书。据说此书的中译本在台湾出版后并未大卖,我的一位编辑朋友猜想,可能是此类图书不适合华人读者口味的缘故。我于是想到通过一位虚拟的成功学大师,来讲述书中的四个成才故事。有些读者也许已经发现,我参考的文本主要有两个:古龙的《七种武器》和金庸的《雪山飞狐》。

此文已刊于《优势》杂志创刊号。
作者博客:blog.sina.com.cn/tianfm

用Buildout来构建Python项目

Tuesday, March 12, 2013

什么是Buildout

(配图: Remixed by Matt Hamilton, original from http://xkcd.com/303)

Buildout是一个基于Python的构建工具, 通过一个配置文件,可以从多个部分创建、组装并部署你的应用,即使应用包含了非Python的组件,Buildout也能够胜任. Buildout不但能够像setuptools一样自动更新或下载安装依赖包,而且还能够像virtualenv一样,构建一个封闭隔离的开发环境.

初始化Buildout

首先我们新建一个目录来共享Buildout配置和文件:

~/Projects$ mkdir buildout
~/Projects$ cd buildout

下载一个2.0的bootstrap.py脚本:

~/Projects/buildout$ wget http://downloads.buildout.org/2/bootstrap.py

然后创建一个Buildout的配置文件:

~/Projects/buildout$ touch buildout.cfg

运行bootstrap.py来生成Buildout相关的文件和目录:

~/Projects/buildout$ python bootstrap.py
Creating directory '/Users/Eric/Projects/buildout/bin'.
Creating directory '/Users/Eric/Projects/buildout/parts'.
Creating directory '/Users/Eric/Projects/buildout/eggs'.
Creating directory '/Users/Eric/Projects/buildout/develop-eggs'.
Generated script '/Users/Eric/Projects/buildout/bin/buildout'.

从上面可以看出, 这一步创建了bin、parts、eggs、develop-eggs四个目录, 并在bin目录下生成了buildout脚本:

  • bin目录用来存放生成的脚本文件
  • parts目录存放生成的数据,大多用不上
  • develop-eggs 存放指向开发目录的链接文件。和buildout.cfg中develop选项相关
  • eggs 是存放从网络上下载下来的egg包。这些包一般在buildout.cfg中的egg选项里定义

把Python和Pyramid集成进来

配置Buildout

~/Projects/buildout$ vim buildout.cfg
[buildout]
# 每个buildout都要有一个parts列表,也可以为空。
# parts用来指定构建什么。如果parts中指定的段中还有parts的话,会递归构建。
parts = tools

[tools]
# 每一段都要指定一个recipe, recipe包含python的代码,用来安装这一段,
# zc.recipe.egg就是把下面eggs选项里列出的包安装到eggs目录中
recipe = zc.recipe.egg
# 定义python解释器
interpreter = python
# 需要安装的egg
eggs =
    pyramid

执行buildout命令来构建一下, 这将会把Pyramid集成进来:

~/Projects/buildout$ bin/buildout

用buildout来构建项目

现在可以创建Pyramid应用了:

~/Projects/buildout$ bin/pcreate -t starter myproject

配置一下Buildout:

~/Projects/buildout$ vim buildout.cfg
[buildout]
parts =
    tools
    apps
develop = myproject

[tools]
recipe = zc.recipe.egg
interpreter = python
eggs =
    pyramid

[apps]
recipe = zc.recipe.egg
eggs = myproject

再次运行一下buildout:

~/Projects/buildout$ bin/buildout

现在可以在buildout的环境中启动myproject了:

~/Projects/buildout$ bin/pserve myproject/development.ini
Starting server in PID 40619.
serving on http://0.0.0.0:6543

最佳实践/Tips

1. 固化egg的版本

把所有的版本信息写到[versions]里面:

[buildout]
extends = versions.cfg
versions = versions
show-picked-versions = true

配置中的 show-picked-versions = true 会在运行buildout的时候把所有的版本打印出来, 把它们写到 versions.cfg 中就可以固化了:

[versions]
Chameleon = 2.11
Mako = 0.7.3
MarkupSafe = 0.15
PasteDeploy = 1.5.0
WebOb = 1.2.3
distribute = 0.6.35
repoze.lru = 0.6
translationstring = 1.1
venusian = 1.0a7
zc.buildout = 2.0.1
zc.recipe.egg = 2.0.0a3
zope.deprecation = 4.0.2
zope.interface = 4.0.5

# Required by:
# pyramid-debugtoolbar==1.0.4
Pygments = 1.6

# Required by:
# myproject==0.0
pyramid = 1.4

# Required by:
# myproject==0.0
pyramid-debugtoolbar = 1.0.4

# Required by:
# myproject==0.0
waitress = 0.8.2

2. 使用mr.developer插件来组织大型的项目, 让开发更方便

[buildout]
...
extensions = mr.developer

3. 开发环境 VS 生产环境

我们可以创建多个配置文件, 比如把buildout.cfg作为生产环境的配置, 把develop的配置从buildout.cfg删除, 创建一个development.cfg作为开发环境的配置:

[buildout]
extends = buildout.cfg
develop = myproject

升级Buildout到2.0版本

Tuesday, March 12, 2013

Buildout已经升级到2.0了, 刚刚升级了一下, 发现一些地方要注意.

  • 我们先要替换掉原来的bootstrap.py脚本, 下载新的2.0的bootstrap: http://downloads.buildout.org/2/bootstrap.py.

  • 新版本的buildout不再支持“buildout-versions”和“buildout.dumppickedversions”, 这些插件的功能已经内置了, 把show-picked-versions = true加到配置文件里面就行了.

    [buildout]
    ...
    show-picked-versions = true
    ...
    

推荐一个Python的异步的BDD框架-pyVows

Friday, November 9, 2012

pyVows, 这是一个异步的BDD测试框架

想象我们正在测试一个加法函数:

def test_sum_returns_42():
    result = add_two_numbers(41, 1)

    assert result
    assert int(result)
    assert result == 42

尽管场景非常简单, 这个测试中却有三个断言, 这样不太好。我们希望每个测试只有一个断言, 所以可以改成这样:

def test_sum_returns_result():
    result = add_two_numbers(41, 1)
    assert result

def test_sum_returns_a_number():
    result = add_two_numbers(41, 1)
    assert int(result)

def test_sum_returns_42():
    result = add_two_numbers(41, 1)
    assert result == 42

除了add_two_numbers这个函数被执行了三次, 一切OK。当然在这么简单的测试中, 一个函数被执行多次也没关系, 但在真实的项目中, 我们应该减少调用次数, 这样我们的测试才能跑得更快。

我们可以用pyVows做如下的改进:

class SumContext(Vows.Context):

    def topic(self):
        return add_two_numbers(41, 1)

    def we_get_a_result(self, topic):
        expect(topic).Not.to_be_null()

    def we_get_a_number(self, topic):
        expect(topic).to_be_numeric()

    def we_get_42(self, topic):
        expect(topic).to_equal(42)

如果没看懂没关系, 我们再来看看下面这个例子

我们来做除零测试:

# division_by_zero_vows.py

from pyvows import Vows, expect

# Create a Test Batch
@Vows.batch
class Divisions(Vows.Context):
    class WhenDividingANumberByZero(Vows.Context):
        def topic(self):
            return 42 / 0

        def we_get_division_by_zero_error(self, topic):
            expect(topic).to_be_an_error_like(ZeroDivisionError)

    class WhenDividingByOne(Vows.Context):
        def topic(self):
            return 42 / 1

        def we_get_the_same_number(self, topic):
            expect(topic).to_equal(42)

我们来执行一下:

 $ pyvows division_by_zero_vows.py

 ============
 Vows Results
 ============

   OK » 2 honored  0 broken (0.000756s)

现在我们来看一个更为复杂一点的例子, 假设我们有一个水果对象模块叫the_good_things:

class Strawberry(object):
    def __init__(self):
        self.color = '#ff0000';

    def isTasty(self):
        return True

class PeeledBanana(object): pass

class Banana(object):
    def __init__(self):
        self.color = '#fff333';

    def peel(self):
        return PeeledBanana()

现在我们来写一些测试在 the_good_things_vows.py:

from pyvows import Vows, expect
from the_good_things import Strawberry, Banana, PeeledBanana

@Vows.batch
class TheGoodThings(Vows.Context):
    class AStrawberry(Vows.Context):
        def topic(self):
            return Strawberry()

        def is_red(self, topic):
            expect(topic.color).to_equal('#ff0000')

        def and_tasty(self, topic):
            expect(topic.isTasty()).to_be_true()

    class ABanana(Vows.Context):
        def topic(self):
            return Banana()

        class WhenPeeled(Vows.Context):
            def topic(self, banana):
                return banana.peel()

            def returns_a_peeled_banana(self, topic):
                expect(topic).to_be_instance_of(PeeledBanana)

我们来运行一下这个测试:

$ pyvows the_good_things_vows.py

 ============
 Vows Results
 ============

   OK » 3 honored  0 broken (0.000863s)

更多特性和使用方法请阅读官方文档http://pyvows.org/

我常用的Mac软件

Friday, November 9, 2012

效率


Google Chrome浏览器
Sparrow, Mac上最好的邮件客户端, 没有之一
1Password, 密码管理
Evernote, 中文叫印象笔记,用来做笔记的,很棒
Skitch, 截图工具
iPhoto, 照片管理

预览工具, 很好用, 读PDF之类的文档用它就够了
TotalFinder 这个插件是完全可以让你的 Finder 强大到爆的一个插件
The Unarchiver, 解压缩工具
Alfred, 替代Spotlight
MPlayerX, 视频播放器
AppCleaner, 删软件用的
QQ
Skype
Adium, IM客户端, 支持Gtalk, MSN等
Twitter Client for Mac
Macbo, 微博客户端
Mindnode Pro, 用来画思维导图的工具
Microsoft Word
Microsoft Excel
Microsoft PowerPoint
Keynote
Dropbox
RSS阅读器

开发相关


iTerm2
Xcode
Dash,API文档阅读器

Transmit, FTP客户端
Pencil, 原型制作软件
Mou, Markdown可视化编辑器
Sublime Text 2, 代码编辑器
Fireworks, 图片处理
Sequel Pro, MySQL客户端

还有

  • QQ拼音输入法
  • oh-my-zsh 我的shell环境
  • MacVim
  • Textmate 编辑器
  • Toast Titanium 光盘刻录
  • MesaSQLite SQLite 客户端
  • Magican 系统清理

推荐一个简单好用的SVG绘图库pygal

Thursday, November 8, 2012

pygal, 是一个Python的SVG绘图库, 可以很方便地用来做数据可视化, 也很容易集成到项目当中。

先来看个例子:

看看代码就这么几行

>>> import pygal                                                       
>>> bar_chart = pygal.Bar()  
>>> bar_chart.add('Fibonacci', [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55])
>>> bar_chart.add('Padovan', [1, 1, 1, 2, 2, 3, 4, 5, 7, 9, 12])
>>> # 保存到文件
>>> bar_chart.render_to_file('bar_chart.svg')
>>> # 它还有个render_in_browser的方法, 直接输出到一个html文件,并在浏览器中显示
>>> bar_chart.render_in_browser()
file:///var/folders/47/zl40dfr57mddjn20xvwtz67m0000gn/T/tmpU9mNa7.html

集成到项目中

我们只要把生成的SVG内容embed到网页上就可以了

例子 (in Pyramid Base Web Application):

view

from pyramid.response import Response
from pyramid.view import view_config
import pygal

@view_config(route_name='svg')
def get_svg(request):
    bar_chart = pygal.Bar(width=600, height=400)
    bar_chart.add('Fibonacci', [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55])
    bar_chart.add('Padovan', [1, 1, 1, 2, 2, 3, 4, 5, 7, 9, 12])
    return Response(body=bar_chart.render(), content_type='image/svg+xml')

route config

config.add_route('svg', '/svg')

embed into html

<embed src="{{ req.route_url('svg')}}" type="image/svg+xml" width="600" height="400" />

这样就可以动态地输出SVG数据图形到页面上了

通过二维码一键安装iOS或Android应用

Monday, November 5, 2012

1, 在网站中建立一个链接,并通过设备浏览器的User-Agent来判断设备是iOS还是Android还是其他。

@view_config(route_name='app', renderer='app.html')
def index(request):
    ua = request.user_agent
    if ('iPhone' in ua) or ('iPod' in ua) or ('iPad' in ua):
        # 跳到AppStore应用地址或者itms-services协议地址
        return HTTPFound('itms-services://?action=download-manifest&url=http://xxx.com/app/app.plist')
    elif ('Android' in ua):
        # 跳到Android应用商店应用地址
        return HTTPFound('https://play.google.com/xxxxx')
    else:
        return {}

2,用上面建立的链接地址做一个二维码

这样通过二维码扫一扫就可以下载iOS或者Android应用了。

用itms-services协议通过网站发布iOS应用

Monday, November 5, 2012

苹果允许用itms-services协议来直接在iPhone/iPad上安装应用程序,我们可以直接生成该协议需要的相关文件,这样app在还没发到AppStore之前可以通过这种方式来安装。前提是设备要是越狱的。

app.plist文件内容

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
   <key>items</key>
   <array>
       <dict>
           <key>assets</key>
           <array>
               <dict>
                   <key>kind</key>
                   <string>software-package</string>
                   <key>url</key>
                   <string>http://xxx.com/.../xxx.ipa(ipa文件的url地址)</string>
               </dict>
               <dict>
                   <key>kind</key>
                   <string>display-image</string>
                   <key>needs-shine</key>
                   <true/>
                   <key>url</key>
                   <string>应用icon地址</string>
               </dict>
               <dict>
                   <key>kind</key>
                   <string>full-size-image</string>
                   <key>needs-shine</key>
                   <true/>
                   <key>url</key>
                   <string>应用大icon地址</string>
               </dict>
           </array>
           <key>metadata</key>
           <dict>
               <key>bundle-identifier</key>
               <string>com.xxxx.xxx (应用的id, 要和ipa文件里的一样)</string>
               <key>bundle-version</key>
               <string>1.0.0</string>
               <key>kind</key>
               <string>software</string>
               <key>subtitle</key>
               <string>应用的名称</string>
               <key>title</key>
               <string>应用的名称</string>
           </dict>
       </dict>
   </array>
</dict>
</plist>

建立一个html页面

<a href="itms-services://?action=download-manifest&url=http://xxx.com/app/app.plist">越狱的iOS设备点此处安装最新版本</a>

用浏览器访问这个页面并点击就可以安装了

Guide To Tracking Multiple Subdomains In Google Analytics

Thursday, August 9, 2012

Copied From: http://www.ericmobley.net/guide-to-tracking-multiple-subdomains-in-google-analytics/

Tracking multiple subdomains is rather easy.

Viewing your traffic for each subdomain is a little trickier. If all you do is set up the code and do not create the profiles and filters as described here, you will have one Google Analytics account tracking all of your subdomains, and absolutely no way to know which subdomain to attribute the traffic to.

The Code

The analytics code that you place on each subdomain will be the same. See the code below.

<script type="text/javascript">

  var _gaq = _gaq || [];
  _gaq.push(['_setAccount', 'UA-xxxxxxxxxxx-1']);
  _gaq.push(['_setDomainName', 'yoursite.com']);
  _gaq.push(['_trackPageview']);

  (function() {
    var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
    ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
    var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
  })();

</script>

Notice there is only one new line of code in this example.

_gaq.push(['_setDomainName', 'yoursite.com']);

You can also get this code if, in Google Analytics, you go to Settings -> Tracking Code -> One domain with multiple subdomains.

The Profiles

Do not create a new Google Analytics account for each subdomain. Sure, it would technically work. There is nothing in the world to stop you. But there is a better way.

Create one Google Analytics account, and then create a profile for each subdomain plus another profile that tracks all subdomains collectively. So if you have two subdomains (www and mobile) you will want three profiles total. One for www, one for mobile, and one to track both subdomains. This will take some time to set up, but will be well worth it in the end.

Once you have created each profile, it’s time to apply the filters.

Filtering Subdomain Traffic In Profiles

We need to apply a filter to ensure that we track only traffic for the profile’s designated subdomain. Go to Admin -> Profiles -> Filters -> New Filter and refer to the screenshot below.

Google Analytics Profile
Filter

It’s that easy. Applying this filter to your profile will ensure that this profile only tracks traffic for the specified subdomain. In this case, mobile.yoursite.com.

Main Profile

When tracking multiple sub domains in one profile, you will not be able to differentiate between your subdomains in your pages list in Google Analytics unless you create a filter.

To illustrate this, go to your page list in analytics, Content -> Site Content -> Pages. You can’t see the hostname at all! See screenshot below.

By default, Google Analytics does not show the hostname or subdomain in your reports. You will not be able to see which home page the back slash above refers to (/). It could refer to www or it could refer to mobile, there is no way of knowing. That’s why we need a filter to apply to this profile.

Filtering Main Profile

In your main profile for google analytics, go to create filter. Refer to the screenshot below to see how to apply filter.

Hostname
Filter

After applying this filter, the subdomain should appear in your page list, and you will be able to differentiate between traffic for each subdomain.

Wrap Up

In this guide, you should have learned how to 1) install Google Analytics code for subdomains 2) filter traffic to ensure that profiles track the traffic for its designated subdomain and 3) filter traffic in your main profile to display the subdomain.

9 Steps to a High-Converting Landing Page

Wednesday, August 1, 2012

Copied From: http://www.onboardly.com/customer-acquisition/9-steps-to-a-high-converting-landing-page/

Handing someone you just met at a networking event a piece of scrap paper with your details scrawled on it won’t get you too far. Doing this is much like promoting your ill-constructed landing page. More often than not, your landing page is a visitor’s first impression of your product, and your best chance to convert that visitor into a customer.

You don’t need to be a rockstar designer to create a beautiful, high-converting landing page. Follow this checklist and you will be well on your way to a rapidly growing customer base.

1. Keep It Simple

Every bit of information about your brand or product does not need to appear on your landing page. A landing page has one goal: lead capture. If you bombard your users with too many quotes, pictures and text, all you’ll end up with is a higher bounce rate. Instead, creating simple and easy-to-digest sections will have a much more positive impact on new visitors.

Suprpod keeps the text to a minimum on their landing page. Offering up simple graphics to explain exactly what the platform does at each stage, visitors can quickly absorb, understand and evaluate the startup.

2. Use Smart Graphics

Graphics on your landing page are like attention-seekers at parties. They are the first to get noticed, and people either love them or hate them. Using a cheesy stock image for your landing page is a total party foul.

We recommend using a screenshot of your app or a professional photo of your product. Whatever you use, make sure it’s authentically you. If you’re a startup with a brand new concept, it would help to turn your product description into a simple graphic (e.g. the three graphics in the Suprpod example above).

The Gijit landing page has one commanding image: the product. It makes the page clean, simple and informative.

3. Be Credible

If you’re a new startup, visitors will love the fact that you’ve been featured by TechCrunch, Mashable, CNN – whatever. If you’ve partnered with Amazon, Dropbox, Twitter or any other well-established brand, shout it from the rooftops. Adding these accomplishments to your landing page footer will make your visitors feel comfortable signing up or purchasing. It’s an easy way to convert early!

4. Use Fewer Input Fields

Less is more in your visitors’ eyes, especially regarding how much information they have to give. The more “required” fields you include on your registration page, the less likely visitors are to give you anything at all. You want to remove every obstacle possible between the initial visit and the conversion.

Instead of asking for first name, last name, date of birth, address, phone number, email address and mother’s maiden name, start with just an email address. You can collect the rest after conversion. Using email notices, drip marketing campaigns and incentives, you can collect everything you need later. Once you have the initial lead, you have a method of later contact.

Imperva has asked for everything and the kitchen sink on their landing page. In reality, all they need is a name and an email address (industry or business name might be nice too).

On the other hand, Zipongo only requires an email address and zip code, the minimum amount of information they need to provide valuable deals to customers.

5. Make Registering Irresistible

It’s too easy for new visitors to bounce from your page. If you don’t have a great call to action, they will. If your page has so much real estate that visitors have to scroll to view it all, include more than one call. You need to give visitors that push to commit and enter the customer acquisition funnel.

Use actionable words such as donate, download, create, call, buy, register, request and subscribe to encourage conversions. Here’s a trick to test just how good your call to action really is.

A. Stand six feet back from your screen and look at it. What do you see? What element stands out the most? It should be your CTA.

B. Sitting at a normal distance from your screen, tilt your head sideways and slightly squint your eyes. Again, your call to action should stand out the most.

6. Offer Something

As awesome as your landing page and brand is, offering a little something extra for new registrations will often seal the deal. Something simple like giving the first hundred people a discount, an eBook or early access will maximize those conversions.

7. A/B Test

Unless you’ve done a ton of research, you won’t know for sure what font, colors or copy lead to the most conversions, but A/B testing will tell you. Landing page specialists like Unbounce let you cleanly test your page. A/B testing involves creating two versions of your web page: an A and a B.

Avoid using multivariate testing. Multivariate involves testing many different elements at the same time (i.e. B has an alternate color palette, different graphics and different copy). By doing this, you’ll be unable to isolate which specific elements are most effective. Performing simple and clean A/B testing will help you create the best possible landing page.

For example, Manpacks A/B tested their landing page to determine what brand messaging results in the most conversions.

8. Be Social

Creating a network through social sharing is the easiest way to get exponential leads. That said, having a button for every social network available is overkill. KISSmetrics gives visitors the ability to share through Twitter, Facebook and Google+. Mashable adds LinkedIn to that lineup. Figure out what networks your audience is using the most and focus on those – the rest is just noise.

Using a tool like ClickToTweet allows you to create a link that shares predetermined text via social media. The rule of thumb is to keep the message short and sweet, especially on Twitter where you should leave extra characters for retweets.

9. Create Excitement Through Copy

None of the above will be worth anything if your copy reads like a children’s book. Don’t ignore your copy because it is often what visitors will base their opinions on. Keep it short and sweet: state the problem, your solution and a call to action. All other information is secondary.

It’s worth it to consider hiring a professional writer for your landing page. The cost will pale in comparison to a stronger conversion rate. Plus, writers know how to turn your 1500 word article into 140 characters. Try services like Scripted and Elance to find top quality writers.

Optimizing your landing page will ensure that it is not your last point of contact with potential customers. This checklist will undoubtedly help you create a high-converting landing page in no time. When in doubt, always test your hypothesis. Landing page development is a science!

10 reasons why I switched to Spine.js

Wednesday, August 1, 2012

Copied From: http://destroytoday.com/blog/reasons-for-spinejs/

In the past year, I shifted interests from the desktop to the web. I’m really drawn to apps that can be accessed from any device with a browser. I have a history with HTML, CSS, Flash and PHP, so I’m familiar with the space, but only in a presentation sense—I’ve made websites, but not web apps. I dove head-first into Rails and instantly fell in love, but the immediate response I knew with Flash was replaced with page loads. Because of this, I turned to Javascript.

Like the new kid at school, I didn’t know who was who in regard to frameworks. I looked around and saw mentions of Backbone.js everywhere, so I assumed it was the standard. After several months, however, I realized it’s not for me—Backbone.js lacks a clear direction of use. Every tutorial I read used a different structure, and it almost seemed too easy to disregard proven design patterns.

Enter Spine.js. I spent a night just reading through its guides and examining its demo apps. Everything I saw just looked right. That night, I wore a big smile and even had trouble sleeping because I couldn’t wait to start using it. What made me so excited?—these 10 things:

  1. A Clear Architecture

    Spine.js follows MVC (model-view-controller). All the apps I’ve written follow the MVC architecture, so I immediately know how to structure my app using Spine.js. I also feel a sense of familiarity off the bat. There’s no question of which class does what or where each class lives.

  2. Models are Models

    Backbone.js has models, but it’s awkward because there are also collections—essentially an array of models that can also query an API and populate itself with the results. Spine.js models are very similar to Rails models. A model can be instantiated to represent a record, but it also has class-level methods for retrieving records from the API. These methods return the results instead of populating an array, so we don’t have to contemplate where the class lives, as one would with collections. And because collections are instances, many of the examples I’ve seen treat them as singletons. As a result, those learning Backbone.js and following these examples are also learning how to write untestable code.

  3. Spine.app

    While using Backbone.js, I found myself copy/pasting code every time I created a new class. I missed the generators I grew accustomed to in Rails. With a single command, I could generate the new class along with its spec, based on a template—this adds years to a dev’s life. “Write Backbone.js generators” was on my todo list for weeks, but I never got around to it.

    Spine.app generates files. With a single line, I can create a class and its spec, just like in Rails. Hell, I can even generate a new app with one command.

  4. Dynamic Records

    This one is just crazy black magic, but it solves a problem I faced with Backbone.js. Let’s say you fetched a record in one view of the app. Then you fetch and update that same record in a different view. In Spine.js, both records will update. You don’t have to worry about keeping them in sync. The moment I read about this, a single tear rolled down my cheek.

  5. Elements Hash

    With Backbone.js, I constantly found myself manually assigning variables to nested elements in every view’s render method, repeating the same code for each element—that’s a lot of boilerplate. In Spine.js, there’s an ‘elements’ hash. The keys are selectors and the values are variable names. Just like the ‘events’ hash in Backbone.js, all your elements are mapped—clearly and easily.

  6. The Release Method

    In my Flash days, optimization was a key to survival. If ever I forgot to remove a single event listener, my app would leak memory like… a poorly maintained app. Because of this, every class I wrote included a method to nullify all references and remove all event listeners. Spine.js has this built in. Sold.

  7. Routing Lives in the Controller

    There is no Router class in Spine.js. This functionality is part of the Controller class where it belongs. In any controller, I can navigate to a new location and react to this new location. Other controllers can react to this new location as well. Now there’s no temptation to create a router singleton.

  8. Model Adapters

    By default, Spine.js saves models in memory, but there are two adapters that can be applied to any model class—Ajax and Local. By simply extending either of these adapters, your data can live in a remote database or even locally using HTML5’s local storage API. All this functionality is a matter of one line of code.

  9. Get a Model from its HTML Element

    This is another issue I faced with Backbone.js. I could instantiate a view and tie a model to it, but if I ever needed to reference that data without access to the view instance, I’d be out of luck. Spine.js provides access to an element’s model through a jQuery plugin. Just call the ‘data’ method on the element and you have your model.

  10. Logging

    Spine.js comes equipped with a nice little convenience module for logging. In any controller, you can call the log method and it will write to the console with a set prefix. You can then toggle whether or not to trace the logs without removing them.

In Conclusion

Now, this list is why I switched to Spine.js. Some apps might be better suited for Backbone.js, or any other JS framework. If you’re researching different frameworks, definitely take a look at all of them. Do your due diligence. Make sure whichever framework you choose has clear examples and is free from gotchas. You don’t want to find yourself halfway through development and questioning the framework.

Resources

By default, Spine.app generates an app with Jasmine as its testing framework. I much prefer Mocha.js, so I forked Spine.app to add Mocha.js support. It also includes a HAML compiler and I have plans to include SASS as well as other helpers.

Here are all my JavaScript bookmarks from my recent high-dive into the language. They consist of links to articles, libraries, and answered StackOverflow questions. Hopefully, they will get a few of you out of a pickle.

I plan to write more about my Spine.js discoveries over the coming months, so keep an eye out if you’re interested.

A guide to analyzing Python performance

Sunday, July 29, 2012

Copied From: http://www.huyng.com/posts/python-performance-analysis/

While it’s not always the case that every Python program you write will require a rigorous performance analysis, it is reassuring to know that there are a wide variety of tools in Python’s ecosystem that one can turn to when the time arises.

Analyzing a program’s performance boils down to answering 4 basic questions:

  1. How fast is it running?
  2. Where are the speed bottlenecks?
  3. How much memory is it using?
  4. Where is memory leaking?

Below, we’ll dive into the details of answering these questions using some awesome tools.

Coarse grain timing with time


Let’s begin by using a quick and dirty method of timing our code: the good old unix utility time.

$ time python yourprogram.py

real    0m1.028s
user    0m0.001s
sys     0m0.003s

The meanings of the three output measurements are detailed in this Stack Overflow article, but in short:

  • real - refers to the actual elapsed wall-clock time
  • user - refers to the amount of CPU time spent outside of the kernel
  • sys - refers to the amount of CPU time spent inside kernel-specific functions

You can get a sense of how much CPU time your program used up, regardless of other programs running on the system, by adding together the sys and user times.

If the sum of sys and user times is much less than real time, then most of your program’s performance issues are likely related to IO waits.
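
This split can also be observed from inside Python. Here is a small sketch of my own (not from the original post) that uses Python 3’s time.process_time to compare wall-clock time against CPU time, which is the same diagnosis as comparing real against user + sys:

```python
import time

def cpu_vs_wall(fn):
    """Run fn once and return (wall-clock seconds, CPU seconds)."""
    wall0, cpu0 = time.time(), time.process_time()
    fn()
    return time.time() - wall0, time.process_time() - cpu0

# CPU-bound work: wall time roughly tracks CPU time
wall_cpu_bound, cpu_cpu_bound = cpu_vs_wall(lambda: sum(i * i for i in range(10 ** 6)))

# IO-like work: sleeping consumes wall time but almost no CPU time
wall_io_bound, cpu_io_bound = cpu_vs_wall(lambda: time.sleep(0.2))
```

If the wall-clock figure is much larger than the CPU figure, that block of code is waiting rather than computing.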

Fine grain timing with a timing context manager


Our next technique involves direct instrumentation of the code to get access to finer grain timing information. Here’s a small snippet I’ve found invaluable for making ad-hoc timing measurements:

timer.py

import time

class Timer(object):
    def __init__(self, verbose=False):
        self.verbose = verbose

    def __enter__(self):
        self.start = time.time()
        return self

    def __exit__(self, *args):
        self.end = time.time()
        self.secs = self.end - self.start
        self.msecs = self.secs * 1000  # millisecs
        if self.verbose:
            print 'elapsed time: %f ms' % self.msecs

In order to use it, wrap blocks of code that you want to time with Python’s with keyword and this Timer context manager. It will take care of starting the timer when your code block begins execution and stopping the timer when your code block ends.

Here’s an example use of the snippet:

from timer import Timer
from redis import Redis
rdb = Redis()

with Timer() as t:
    rdb.lpush("foo", "bar")
print "=> elapsed lpush: %s s" % t.secs

with Timer() as t:
    rdb.lpop("foo")
print "=> elapsed lpop: %s s" % t.secs

I’ll often log the outputs of these timers to a file in order to see how my program’s performance evolves over time.
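
As a sketch of what that might look like (my own variation on the Timer above, not from the original post; the timings.log filename is made up), the timer can hand its measurements to the logging module instead of printing them:

```python
import logging
import time

# Append each measurement to a log file so performance can be compared across runs
logging.basicConfig(filename="timings.log", level=logging.INFO)

class LoggingTimer(object):
    """Same context manager idea as Timer above, but it logs instead of printing."""
    def __init__(self, label=""):
        self.label = label

    def __enter__(self):
        self.start = time.time()
        return self

    def __exit__(self, *args):
        self.secs = time.time() - self.start
        logging.info("%s elapsed: %f s", self.label, self.secs)

with LoggingTimer("sum") as t:
    total = sum(range(100000))
```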

Line-by-line timing and execution frequency with a profiler


Robert Kern has a nice project called line_profiler which I often use to see how fast and how often each line of code is running in my scripts.

To use it, you’ll need to install the python package via pip:

$ pip install line_profiler

Once installed you’ll have access to a new module called “line_profiler” as well as an executable script “kernprof.py”.

To use this tool, first modify your source code by decorating the function you want to measure with the @profile decorator. Don’t worry, you don’t have to import anything in order to use this decorator. The kernprof.py script automatically injects it into your script’s runtime during execution.

primes.py

@profile
def primes(n): 
    if n==2:
        return [2]
    elif n<2:
        return []
    s=range(3,n+1,2)
    mroot = n ** 0.5
    half=(n+1)/2-1
    i=0
    m=3
    while m <= mroot:
        if s[i]:
            j=(m*m-3)/2
            s[j]=0
            while j<half:
                s[j]=0
                j+=m
        i=i+1
        m=2*i+3
    return [2]+[x for x in s if x]
primes(100)

Once you’ve gotten your code setup with the @profile decorator, use kernprof.py to run your script.

$ kernprof.py -l -v primes.py

The -l option tells kernprof to inject the @profile decorator into your script’s builtins, and -v tells kernprof to display timing information once your script finishes. Here’s what the output should look like for the above script:

Wrote profile results to primes.py.lprof
Timer unit: 1e-06 s

File: primes.py
Function: primes at line 2
Total time: 0.00019 s

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     2                                           @profile
     3                                           def primes(n): 
     4         1            2      2.0      1.1      if n==2:
     5                                                   return [2]
     6         1            1      1.0      0.5      elif n<2:
     7                                                   return []
     8         1            4      4.0      2.1      s=range(3,n+1,2)
     9         1           10     10.0      5.3      mroot = n ** 0.5
    10         1            2      2.0      1.1      half=(n+1)/2-1
    11         1            1      1.0      0.5      i=0
    12         1            1      1.0      0.5      m=3
    13         5            7      1.4      3.7      while m <= mroot:
    14         4            4      1.0      2.1          if s[i]:
    15         3            4      1.3      2.1              j=(m*m-3)/2
    16         3            4      1.3      2.1              s[j]=0
    17        31           31      1.0     16.3              while j<half:
    18        28           28      1.0     14.7                  s[j]=0
    19        28           29      1.0     15.3                  j+=m
    20         4            4      1.0      2.1          i=i+1
    21         4            4      1.0      2.1          m=2*i+3
    22        50           54      1.1     28.4      return [2]+[x for x in s if x]

Look for lines with a high amount of hits or a high time interval. These are the areas where optimizations can yield the greatest improvements.
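
Once the profiler has singled out a hot line, the standard-library timeit module is handy for comparing candidate rewrites of just that line in isolation. A hypothetical example of mine (not from the original post), comparing two ways of filtering the sieve list from line 22 above:

```python
import timeit

# Shared setup: a sieve-like list of odd numbers
setup = "s = list(range(3, 10001, 2))"

# Candidate 1: explicit loop
loop_stmt = """
out = []
for x in s:
    if x:
        out.append(x)
"""

# Candidate 2: list comprehension (what the original code uses)
comp_stmt = "out = [x for x in s if x]"

t_loop = timeit.timeit(loop_stmt, setup=setup, number=200)
t_comp = timeit.timeit(comp_stmt, setup=setup, number=200)
```

On most interpreters the comprehension wins, but the point is to measure the candidates rather than assume.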

How much memory does it use?


Now that we have a good grasp on timing our code, let’s move on to figuring out how much memory our programs are using. Fortunately for us, Fabian Pedregosa has implemented a nice memory profiler modeled after Robert Kern’s line_profiler.

First install it via pip:

$ pip install -U memory_profiler
$ pip install psutil

(Installing the psutil package here is recommended because it greatly improves the performance of the memory_profiler).

Like line_profiler, memory_profiler requires that you decorate your function of interest with an @profile decorator like so:

@profile
def primes(n): 
    ...
    ...

To see how much memory your function uses run the following:

$ python -m memory_profiler primes.py

You should see output that looks like this once your program exits:

Filename: primes.py

Line #    Mem usage  Increment   Line Contents
==============================================
     2                           @profile
     3    7.9219 MB  0.0000 MB   def primes(n): 
     4    7.9219 MB  0.0000 MB       if n==2:
     5                                   return [2]
     6    7.9219 MB  0.0000 MB       elif n<2:
     7                                   return []
     8    7.9219 MB  0.0000 MB       s=range(3,n+1,2)
     9    7.9258 MB  0.0039 MB       mroot = n ** 0.5
    10    7.9258 MB  0.0000 MB       half=(n+1)/2-1
    11    7.9258 MB  0.0000 MB       i=0
    12    7.9258 MB  0.0000 MB       m=3
    13    7.9297 MB  0.0039 MB       while m <= mroot:
    14    7.9297 MB  0.0000 MB           if s[i]:
    15    7.9297 MB  0.0000 MB               j=(m*m-3)/2
    16    7.9258 MB -0.0039 MB               s[j]=0
    17    7.9297 MB  0.0039 MB               while j<half:
    18    7.9297 MB  0.0000 MB                   s[j]=0
    19    7.9297 MB  0.0000 MB                   j+=m
    20    7.9297 MB  0.0000 MB           i=i+1
    21    7.9297 MB  0.0000 MB           m=2*i+3
    22    7.9297 MB  0.0000 MB       return [2]+[x for x in s if x]

Where’s the memory leak?


The CPython interpreter uses reference counting as its main method of keeping track of memory. This means that every object contains a counter, which is incremented when a reference to the object is stored somewhere, and decremented when a reference to it is deleted. When the counter reaches zero, the CPython interpreter knows that the object is no longer in use, so it deletes the object and deallocates the occupied memory.

A memory leak can often occur in your program if references to objects are held even though the object is no longer in use.
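
You can watch this counter move with sys.getrefcount (a tiny illustration of my own, not from the original post; note that getrefcount itself temporarily holds one extra reference):

```python
import sys

x = object()
base = sys.getrefcount(x)   # includes the temporary reference held by getrefcount

holder = []
holder.append(x)            # storing a reference increments the counter
after_store = sys.getrefcount(x)

holder.pop()                # dropping the reference decrements it again
after_drop = sys.getrefcount(x)
```

A leak, in these terms, is just a holder like that list somewhere in your program that never gets popped.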

The quickest way to find these “memory leaks” is to use an awesome tool called objgraph written by Marius Gedminas. This tool allows you to see the number of objects in memory and also locate all the different places in your code that hold references to these objects.

To get started, first install objgraph:

$ pip install objgraph

Once you have this tool installed, insert into your code a statement to invoke the debugger:

import pdb; pdb.set_trace()

Which objects are the most common?

At run time, you can inspect the top 20 most prevalent objects in your program by running:

(pdb) import objgraph
(pdb) objgraph.show_most_common_types()

MyBigFatObject             20000
tuple                      16938
function                   4310
dict                       2790
wrapper_descriptor         1181
builtin_function_or_method 934
weakref                    764
list                       634
method_descriptor          507
getset_descriptor          451
type                       439

Which objects have been added or deleted?

We can also see which objects have been added or deleted between two points in time:

(pdb) import objgraph
(pdb) objgraph.show_growth()
.
.
.
(pdb) objgraph.show_growth()   # this only shows objects that have been added or deleted since the last show_growth() call

traceback                4        +2
KeyboardInterrupt        1        +1
frame                   24        +1
list                   667        +1
tuple                16969        +1

What is referencing this leaky object?

Continuing down this route, we can also see where references to any given object are held. Let’s take as an example the simple program below:

x = [1]
y = [x, [x], {"a":x}]
import pdb; pdb.set_trace()

To see what is holding a reference to the variable x, run the objgraph.show_backrefs() function:

(pdb) import objgraph
(pdb) objgraph.show_backrefs([x], filename="/tmp/backrefs.png")

The output of that command should be a PNG image stored at /tmp/backrefs.png and it should look something like this:

(image: back references graph)

The box at the bottom with red lettering is our object of interest. We can see that it’s referenced by the symbol x once and by the list y three times. If x is the object causing a memory leak, we can use this method to see why it’s not automatically being deallocated by tracking down all of its references.

So to review, objgraph allows us to:

  • show the top N objects occupying our Python program’s memory
  • show what objects have been deleted or added over a period of time
  • show all references to a given object in our script

Effort vs precision


In this post, I’ve shown you how to use several tools to analyze a Python program’s performance. Armed with these tools and techniques, you should have all the information required to track down most memory leaks as well as identify speed bottlenecks in a Python program.

As with many other topics, running a performance analysis means balancing the tradeoffs between effort and precision. When in doubt, implement the simplest solution that will suit your current needs.


How to avoid deleting important files on Linux

Sunday, February 12, 2012

safe-rm, a handy tool presented at Linux.conf.au 2012, is a wrapper around /bin/rm that helps Linux system administrators protect important files from accidental deletion.

1, install it

apt-get install safe-rm

2, important system directories are now protected from deletion

$ rm -rf /usr
Skipping /usr

3, add the paths or files you want to protect to /etc/safe-rm.conf or ~/.safe-rm

Hidden tips 001

Tuesday, November 22, 2011

hidden tips

1, install ruby-1.9.3 on Mac OS X Lion via rvm

 rvm install 1.9.3 --with-gcc=clang

2, for package names containing an underscore in buildout, the underscore should be replaced with a dash, for example pyramid_jinja2:

 [versions]
 pyramid-jinja2 = 1.2

3, install proxychains on Mac OS X Lion via Homebrew:
download my Proxychains formula, then run brew install proxychains

4, configuring pyramid_debugtoolbar.
If the request’s REMOTE_ADDR is not 127.0.0.1, you should add the debugtoolbar.hosts setting to your .ini file, for example:

debugtoolbar.hosts = 127.0.0.1 192.168.0.116

Upgrade postgresql-8.4 to postgresql-9.1

Tuesday, November 22, 2011

install postgresql 9.1 on Ubuntu via apt-get

1, back up your databases

 ~ pg_dumpall > outputfile

2, add postgresql apt repository

~ sudo add-apt-repository ppa:pitti/postgresql

3, remove postgresql-8.4

~ sudo apt-get remove postgresql-8.4

4, update apt source index

~ sudo apt-get update

5, install postgresql-9.1

~ sudo apt-get install postgresql-9.1

6, create a new user for postgres

~ sudo -u postgres sh
[sudo] password for eric: 
$ createuser -P eric
Enter password for new role: 
Enter it again: 
Shall the new role be a superuser? (y/n) y
$ exit

7, restore your data from backup

~ psql -d postgres -f outputfile

Done!~

Connecting to an L2TP VPN on Mac OS X without a shared secret

Monday, November 7, 2011

I needed to connect to an L2TP VPN, but after filling in the connection details I got the error: “The IPSec shared secret is missing. Please verify your settings and try to reconnect.” This VPN doesn’t use an IPSec shared secret, though; a quick search turned up a small workaround to bypass the check.

Create a new file named options under /etc/ppp with the following content:

plugin L2TP.ppp
l2tpnoipsec

Now you can connect without a shared secret. Finally, don’t forget to check “Send all traffic over VPN connection” in the advanced settings.

© 2009-2013 lxneng.com. All rights reserved. Powered by Pyramid
