GSOC 2012 Proposal — “Useful tool to help bootstrap of new bindings”

The following is my GSoC 2012 proposal. This page serves as the additional-info link, where anyone interested can read it and leave comments.

Abstract

—————–

Shiboken is a front-end plugin for Generator Runner that generates bindings for C++ libraries using CPython source code. However, because C++ has a complicated syntax and runtime model, creating Python bindings is difficult even for people who use Shiboken often, and there have already been complaints on the PySide mailing list. A tool that helps bootstrap new bindings would therefore be valuable.

Introduction

——————

This would be a command-line tool. Users pass it the header files of the libraries they want to bind; when run, it generates basic typesystem files for these libraries plus the CMake files for the project.

To get this, we have to modify APIExtractor, because it does the hard work of parsing both the C++ headers and the binding’s typesystem XML description files, and of combining these two sets of information into something the binding generator can work with. The ideal approach is to modify APIExtractor so that it takes a given set of headers and produces an intermediate XML description file containing the information our tool needs to generate the final binding typesystem files.
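As a rough illustration of the second half of this pipeline, here is a minimal Python sketch that turns an intermediate description into a typesystem skeleton. The intermediate format (the `api`, `class` and `enum` elements and the `kind` attribute) is purely an assumption made for this example, since the real format is still to be designed; only the output element names (`typesystem`, `object-type`, `value-type`, `enum-type`) follow the typesystem vocabulary used in this proposal.

```python
import xml.etree.ElementTree as ET

# Hypothetical intermediate description, as a modified APIExtractor might emit it.
# The element and attribute names here are invented for illustration only.
INTERMEDIATE = """
<api package="mylib">
  <class name="Widget" kind="object-type"/>
  <class name="Point" kind="value-type"/>
  <enum name="Color"/>
</api>
"""


def build_typesystem(intermediate_xml):
    """Return a basic typesystem XML string for the given intermediate XML."""
    source = ET.fromstring(intermediate_xml)
    root = ET.Element("typesystem", package=source.get("package"))
    for node in source:
        if node.tag == "class":
            # The (assumed) 'kind' attribute already names the typesystem tag.
            ET.SubElement(root, node.get("kind"), name=node.get("name"))
        elif node.tag == "enum":
            ET.SubElement(root, "enum-type", name=node.get("name"))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_typesystem(INTERMEDIATE))
```

A real tool would also have to decide between object-type and value-type on its own and flag functions that need code injection; the sketch assumes that decision has already been recorded in the intermediate file.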

However, the modification work means dealing with C++, so it may not be practical to modify APIExtractor and write the tool that consumes its output at the same time. In this summer project, therefore, the first stage is to modify APIExtractor and the second is to write the command-line tool.

The following is the concrete weekly plan:

Timeline

——————

Week 1: Clone PySide and study the overall framework of APIExtractor by reading the source code.

Week 2-Week 8: Modify APIExtractor to do what we want. In detail:

–Week 2: find the classes by parsing C++ headers

–Week 3: handle namespace-type, interface-type, object-type

–Week 4: extract primitive-type, enum-type, value-type

–Week 5: extract other complex types such as templates

–Week 6: handle C/C++ functions and mark whether they need code injection or argument modification

–Week 7: extract all exportable symbols from the given library

–Week 8: write this information to an intermediate XML file

Week 9-Week 12: Create the command-line tool to help bootstrap new bindings, and test it

–Week 9: write a Python script that generates new binding typesystem files from the intermediate file created earlier

–Week 10: write tests for this script

–Week 11: through repeated testing, check whether the earlier modifications to APIExtractor behave well

–Week 12: use the test results to improve the command-line tool in turn

Week 13-Week 14: polish the code and documentation, and begin to submit patches

–Week 13: polish the code and documentation

–Week 14: create patches and try to submit them

Week 15….: summarize the whole work and say goodbye to summer

About me

——————–

I’m an undergraduate from China, studying for a bachelor’s degree in computer science and technology at USTB (University of Science and Technology Beijing). Java and Python are the main languages I use to write programs.

Over my coding days, I have come to know more and more about open source/FOSS, especially its spirit: share, collaborate, make friends. Every time I find an excellent solution to a problem I ran into, I get excited, and I have learned a lot from the process of finding, reporting, and solving problems.

I took part in GSoC 2011 last year with a Python idea as my summer project. You can see what the project does on the Python wiki page http://wiki.python.org/moin/SummerOfCode/2011/SetuptoolsFeatures, and you can learn more from my GSoC blog: http://higery.wordpress.com/

Contact Info:

——————–

# Name: Xu Dehai (you can call me by my English name, higery)

# Blog: http://higery.wordpress.com/

# IRC: higery on irc.freenode.net

Posted in Python

Done and keep going!

It’s amazing that I have passed the final evaluation of Google Summer of Code 2011, and I feel I should write something about it. Looking back on the past 3 months, I have learned a lot by taking part. Although this term has ended, I still have a long way to go to learn more about open source and contribute to it. Below, I want to write down some tips I gathered this summer, which may be useful to future students, and also something about what I have gained from this term, which may help me do better and go further.

Google Inc. offers students a very good chance to get deeply involved in open source, and the most exciting part is the $5000 that motivates students to do their best. What’s more, we can also make friends during the term. So I can’t find a reason not to try our best to join, but not everyone who applies is accepted, which means there are requirements to meet. Having now gone through the whole GSoC process as a student, I think the following tips may help future students succeed at the first step.

Tip1: Search

Open source is a big world, and many open-source groups and organizations take part in GSoC, so at the very beginning you should search for projects you are familiar with or interested in. It is difficult to complete a project successfully if you don’t even know the language it is developed in, so your candidate projects should be in fields you know to some extent. If there is a project you are very interested in but lack the background knowledge for, you should prepare for a long time before deciding to apply. What we have to do is close the gap between the organizations and us students, so that during the application process they can see we may be qualified candidates for their projects.

Tip2: Concentrate

Google allows us to submit more than 10 proposals (and this number seems to grow over the years) to apply for a student slot. It may seem that submitting many proposals gives us a good chance: even if one is refused, another may be accepted. But I do not think so. First, writing a good proposal takes a long time, and writing more takes much more. Second, we need to discuss each idea with the corresponding organization, so we can’t guarantee enough attention to all of them. A project that has not been carefully discussed will never be accepted. More is not always better.

Instead, we can focus on one project and prepare for it carefully: learn what the project does, what its parent organization does, what development process and tools it uses, and so on. This helps us communicate better with the developers or the idea owner, and that communication helps us understand the idea better, which in turn raises our chance of being accepted. In particular, the prospective mentor will learn from it whether you truly want the project and whether you are qualified. This summer I applied only for the idea “fulfil the setuptools features for packaging” and communicated frequently with Eric (my current mentor). Before that, I was not very familiar with distutils, especially the new packaging module; with more communication and reading of the relevant documents, I understood the idea better and better and could communicate with Eric more effectively, with fewer misunderstandings. In the end, I was accepted.

So, in my opinion, rather than spreading yourself over many project ideas, concentrate on one (or no more than two) and prepare carefully for it; you will get a better result.

(To be continued……)

Posted in GSOC2011

13# Weekly Report [2011/8/15 - 2011/8/21]

Tomorrow is the ‘pencils down’ date, so most of what I did this week was still fixing and improving my code and docs to get all of my contributions into a reviewable state.

The details:
for the develop command
1. enhance docs for develop
2. improve code based on my mentor’s reviews of my patch
3. merge my development branch with default and push all changesets to the remote repo
4. show usage of the develop command and listen to opinions from others on the bug tracker

for the scripts feature
1. add docs for the automatic scripts generation functionality
2. improve code based on reviews
3. merge
4. show what I have done for this feature and what problems I think the current patch has, and join the discussion

Good luck to everyone!
And
Best wishes to my mentor, he did help me a lot.

Regards,
higery

Posted in Distutils2, GSOC2011, Python

12# Weekly Report [2011/8/8 - 2011/8/14]

Last week I was not familiar with how to write docs for Python source code, so I sent an email to my mentor and learned that docs mean more than the comments in the source. From reading the tutorial at http://docs.python.org/dev/ and the Sphinx guide, the documentation side of Python has become much clearer to me. I now know that what I should do in the following days is edit the corresponding .rst files in the /Doc directory. My mentor also gave good suggestions on which files to modify for my two tasks. After days of reading the .rst files and the generated HTML docs in a browser, I know more about Sphinx, such as the concept of reStructuredText directives, and the docs work is getting easier.

For the ‘develop’ command, what I did was:
1) editing Doc/install/install.rst to show usage of the ‘develop’ command
2) editing Lib/packaging/install.py to make ‘develop’ a kind of install command.

I mainly added several functions to support ‘develop’, but I want to say more here. If you have read the source of install.py, you know there are three different ways to install a project, or three versions: _run_packaging_install, _run_setuptools_install, and _run_distutils_install. The ‘develop’ command is a kind of install command, but it also looks like a quite different, stand-alone thing. So there is also more than one way to run the develop command: in my code they are _run_packaging_develop and _run_setuptools_develop (there is no distutils version). In addition, there are counterparts of other existing functions in install.py to support ‘develop’, for example _run_develop_from_dir, _run_develop_from_archive, etc.

One big problem is the naming of the entry point of ‘develop’. There is an ‘install_local_project’ function that serves as the entry point for installing a project with the ‘install’ command, so what should the entry point of ‘develop’ be called? There has already been a lot of discussion, even argument, on the mailing list about this question. Based on my own view and my understanding of what people have said on the ML, I have temporarily named this entry function ‘install_editable’.

For the scripts work, what I did was:
1) removing old-style scripts support in Lib/packaging/command/build_scripts.py and retaining only the wrapper scripts support
2) making the corresponding removals in Lib/packaging/tests/test_command_build_scripts.py and fixing the tests after this change
3) enhancing the docstrings of the ‘build_scripts’ command, because it only supports new-style scripts now
4) editing the corresponding .rst files in /Doc to complete the docs for new-style scripts

There is also a problem I think should be considered: this change may break the old semantics of ‘scripts’. In the setup.cfg configuration file, scripts is an item of the ‘files’ section, meaning ‘a list of scripts the project includes *optional *multi’. However, our new-style scripts do not exist as files; they are only generated when the build_scripts command runs. If we put dotted-path strings such as ‘foo=a.b.c.main’, which are wrapper entries for future generation, on the ‘scripts’ line, it looks strange: is ‘foo=a.b.c.main’ a file?
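To make the idea of a wrapper entry concrete, here is a small Python sketch of how a string like ‘foo=a.b.c.main’ could be split and turned into the text of a wrapper script. This is not the code in build_scripts.py; the names parse_entry, render_wrapper and WRAPPER_TEMPLATE, the template itself and the default interpreter line are all assumptions made for this illustration.

```python
# Hypothetical template for a generated wrapper script (not the real one).
WRAPPER_TEMPLATE = """\
#!{executable}
import sys
from {module} import {function}

if __name__ == '__main__':
    sys.exit({function}())
"""


def parse_entry(entry):
    """Split 'name = package.module.function' into (name, module, function)."""
    name, _, target = entry.partition("=")
    module, _, function = target.strip().rpartition(".")
    name = name.strip()
    if not name or not module or not function:
        raise ValueError(
            "expected 'name = package.module.function', got %r" % entry)
    return name, module, function


def render_wrapper(entry, executable="/usr/bin/env python"):
    """Return (script name, script text) for one wrapper entry."""
    name, module, function = parse_entry(entry)
    text = WRAPPER_TEMPLATE.format(
        executable=executable, module=module, function=function)
    return name, text


if __name__ == "__main__":
    script_name, script_text = render_wrapper("foo=a.b.c.main")
    print(script_name)
    print(script_text)
```

The sketch also shows why such an entry is not a file: the script text only comes into being when the entry is rendered, which is the semantic mismatch with a ‘scripts’ line that lists existing files.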

I have sent an email to my mentor and am waiting for his reply. We discussed this question earlier, but I found that working around it may be a problem.

The remaining work is mostly docs for the scripts feature.

The ‘pencils down’ date is approaching; I’m doing my best and hoping for good luck.

Best regards,
higery

Posted in Distutils2, GSOC2011, Python

11# Weekly Report [2011/8/1 - 2011/8/7]

The main work I did this week was implementing as many of the ‘develop’ features as possible, and improving the tests and comments. However, for a long time I didn’t know the difference between comments and docs, and I’m not sure whether the comments in their current state are enough. After a short email exchange with my mentor, I learned that the docs for Python code are written in a formatted documentation language, reStructuredText. I will try to enrich the docs of my code as soon as possible and make them at least readable and reviewable.

OK. Let’s come back to the work I have done this week:

1 ‘develop’ command:

1) add an uninstall_link function to support the ‘uninstall’ option

2) add a test function for the ‘uninstall’ option

3) improve code and comments (future docs)

2 automatic scripts generation:

1) use the ‘-w’ or ‘-window’ flag to mark an entry as a ‘gui_scripts’ entry point

2) add support for ‘gui_scripts’ entry points in packaging

Tomorrow I will sort out the scripts generation work and write a post showing its status. Future work will focus on the docs for ‘develop’ and scripts. Because I was not very familiar with reStructuredText before this post, the plan shown in the last weekly report will be adjusted slightly. Anyway, I can still complete all the work before the ‘pencils down’ date, so don’t worry about me.

Best regards,
higery

Posted in Distutils2, GSOC2011, Python

status of ‘develop’

Features already implemented:

1) [main use] projects become importable after being installed with this command, without copying any files

2) ['install-dir' option] the installation directory can be specified if the default site-packages is not writable, but the specified path must already be on ‘sys.path’; it is up to the user of this command whether to ‘install’ the project to the default path or to a specified one

3) ['install-pth' option] the installation path of the .pth file can be specified; it is up to the user which directory the .pth file is installed to. This is useful when we have no write permission for the default site-packages

4) ['distinfo-path' option] set a specified relative path to be used in the .distinfo-link file. This can be useful in circumstances where your installation directory is being shared by code running under multiple platforms (e.g. Mac and Windows) which have different absolute locations for the code under development, but the same relative locations with respect to the installation directory.
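The mechanism behind the main use case can be shown with a short Python sketch: a .pth file placed in a site directory makes a project importable from its source location without copying anything. This only demonstrates the standard-library behaviour via `site.addsitedir`; it is not the ‘develop’ implementation, the .distinfo-link file is not modelled, and the directory and module names are made up for the example.

```python
import importlib
import os
import site
import tempfile

# A stand-in for the project under development (hypothetical module name).
project_dir = tempfile.mkdtemp()
with open(os.path.join(project_dir, "demo_project_mod.py"), "w") as f:
    f.write("ANSWER = 42\n")

# A stand-in for the directory chosen with the 'install-pth' option.
install_dir = tempfile.mkdtemp()
with open(os.path.join(install_dir, "demo_project.pth"), "w") as f:
    # Each line of a .pth file names a directory to add to sys.path.
    f.write(project_dir + "\n")

# Process the directory the way Python processes site-packages at startup:
# the .pth file is read and project_dir is appended to sys.path.
site.addsitedir(install_dir)
importlib.invalidate_caches()

# The project is importable although no file was copied into install_dir.
module = importlib.import_module("demo_project_mod")
print(module.ANSWER)
```

Because the .pth file stores only a path, edits to the project source are visible on the next import, which is the point of a development install.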

Features that should be implemented but are not yet:

1) ['uninstall' option] remove all the installed files (the .pth and .distinfo-link files)

2) ['no-scripts' option] tell the command whether to install the wrapper scripts (this option has not been implemented yet because the automatic scripts generation work is not finished)

Short plan to finish this command:

1) implement these two unimplemented features

2) polish and normalize the docs and tests

Deadline:
2011/08/05

Posted in Distutils2, GSOC2011, Python

10# Weekly Report [2011/7/25 - 2011/7/31]

With my mentor’s good suggestions and help, the work on automatic script generation for packaging is getting better and better, but many places still need improvement. One big problem is how to put wrapper script entries into the ‘scripts’ option, because doing so may break the semantics of ‘scripts’ and would also look strange.

The main work I did this week was also improving the script generation. One thing to note is that when I write tests I often neglect the side effects of my code. For instance, to test whether the generated wrapper scripts are runnable, the Python environment must know the path of the corresponding modules, so I used ‘sys.path.append(source)’. The problem is that this dirties the current Python environment, which may affect other test cases. The best practice is always to restore any changed environment after a test. In this case, ‘self.addCleanup(sys.path.remove, source)’ makes sure the source path is removed from ‘sys.path’ after the test has run.
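The cleanup pattern described above can be sketched as a self-contained test. The test class, the module name cleanup_demo_mod and the run_demo helper are invented for this illustration and are not taken from the packaging test suite; only ‘sys.path.append(source)’ and ‘self.addCleanup(sys.path.remove, source)’ come from the post.

```python
import os
import sys
import tempfile
import unittest


class WrapperScriptTest(unittest.TestCase):

    def test_module_is_importable(self):
        # Create a throwaway source directory holding one tiny module.
        source = tempfile.mkdtemp()
        with open(os.path.join(source, "cleanup_demo_mod.py"), "w") as f:
            f.write("def main():\n    return 0\n")

        # Make the module importable for this test only...
        sys.path.append(source)
        # ...and register the undo step right away, so it runs even if the
        # assertions below fail.
        self.addCleanup(sys.path.remove, source)
        # Also forget the imported module so later tests start clean.
        self.addCleanup(sys.modules.pop, "cleanup_demo_mod", None)

        import cleanup_demo_mod
        self.assertEqual(cleanup_demo_mod.main(), 0)


def run_demo():
    """Run the test and report (test passed, sys.path was restored)."""
    before = list(sys.path)
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(WrapperScriptTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful(), sys.path == before


if __name__ == "__main__":
    print(run_demo())
```

Registering the cleanup immediately after the change, rather than at the end of the test, is what keeps a failing assertion from leaking the modified ‘sys.path’ into other test cases.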

I’m still looking for a good way to add the dotted path to support automatic script generation while avoiding large modifications to the current build_scripts.py.

Now that the end of the term is near, I should make a short plan here for a more or less polished final result.

1) 2011/08/02: sort out the work on the ‘develop’ command and write a post showing what has been done and what has not. This post should also cover the docs and tests that need to be enhanced.

2) 2011/08/03 ~ 2011/08/05: bring the docs and tests of ‘develop’ to a reviewable state.

3) 2011/08/06: sort out the scripts generation work and also write a post showing the status and plan for these final days

4) 2011/08/07 ~ 2011/08/10: bring the docs and tests of scripts to a reviewable state.

5) 2011/08/11 ~ : continuous enhancement and improvement of ‘develop’ and scripts (even after this term)

Best regards,
higery

Posted in Distutils2, GSOC2011, Python