Gentoo Forums

Posted: **Tue Aug 30, 2005 11:52 am**

I also don't know why the Portage developers don't try things like Psyco, an optimizer that can speed up Python apps painlessly, before attempting any port to C++.
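
For anyone who hasn't seen it, enabling Psyco was usually a matter of a few lines at the top of a script. This is only a minimal sketch: Psyco ran on CPython 2.x on 32-bit x86, so the import is guarded and the script runs unchanged without it. `count_matching` and the package names are hypothetical stand-ins for a hot loop, not anything taken from Portage.

```python
# Psyco only existed for CPython 2.x on 32-bit x86, so guard the import:
# the script then behaves identically (just unaccelerated) where it is absent.
try:
    import psyco
    psyco.full()  # ask Psyco to specialize every function it can
    HAVE_PSYCO = True
except ImportError:
    HAVE_PSYCO = False


def count_matching(names, prefix):
    """Hypothetical hot loop: count package names that start with prefix."""
    total = 0
    for name in names:
        if name.startswith(prefix):
            total += 1
    return total


if __name__ == "__main__":
    names = ["sys-apps/portage", "sys-apps/sed", "dev-lang/python"]
    print(count_matching(names, "sys-apps/"), HAVE_PSYCO)
```

Whether this would help Portage much is a separate question, since Psyco speeds up CPU-bound Python code and does nothing for disk I/O.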

Posted: **Tue Aug 30, 2005 3:23 pm**

Alejandro Nova wrote: I also don't know why the Portage developers don't try things like Psyco, an optimizer that can speed up Python apps painlessly, before attempting any port to C++.

Because Daniel Robbins was violently against a C++ version of Portage for reasons that some people thought were idiotic. One dev that I know of even left because of it. Daniel loved Portage, a little too much I think. I'd love to see Portage redone in C++, because the slowness of the Python scripts is just killing Gentoo. What's the point of ricing your system's CFLAGS just to have it run slower than a dead donkey because the scripts bring it to its knees?

Posted: **Thu Sep 01, 2005 5:03 am**

I agree, Portage is becoming too slow, but it's not entirely Python's fault. Portage just wasn't designed to scale as far as it's being asked to; the number of ebuilds has exploded over the past few years. Recoding Portage in Python would be okay too, but it should be entirely restructured.

Instead of having all the ebuilds on your own box, they should be on a mirror.

The only thing you should have locally is a sqlite or some other small db with the metadata: the package name, keywords, dependency information and so on. That way it would be a database of about 10 MB at most, and when you sync you only get a diff since the last time you synced (this would work with some commit id, possibly like svn, where you update from revision <nr> of the metadata database). The ebuilds and any patches and init scripts are then fetched when you emerge.

Of course you should still be able to use overlays, and also to download an ebuild, make changes and then emerge it.

But this way, syncing a lot wouldn't hurt the Gentoo mirrors so much, as they would probably send out roughly 200 or 300 kB of data for a sync covering one day, instead of A LOT.
It would also be smashingly fast, and it wouldn't take up so much space, which, if you think about it, is quite stupid the way it works now.
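
To make the idea a bit more concrete, here is a rough sketch of such a local metadata database using Python's standard `sqlite3` module. Everything in it is an assumption for illustration: the table and column names, the shape of the diff (`changed` rows plus `removed` package names), and the idea that the mirror hands out a single increasing revision number. Actually fetching the diff from a mirror is left out entirely.

```python
import sqlite3

# Hypothetical schema: one row per package version, plus the revision of the
# metadata database that the local copy was last synced to.
SCHEMA = """
CREATE TABLE IF NOT EXISTS package (
    cpv      TEXT PRIMARY KEY,
    keywords TEXT NOT NULL,
    depend   TEXT NOT NULL,
    revision INTEGER NOT NULL
);
CREATE TABLE IF NOT EXISTS sync_state (
    id       INTEGER PRIMARY KEY CHECK (id = 1),
    revision INTEGER NOT NULL
);
INSERT OR IGNORE INTO sync_state (id, revision) VALUES (1, 0);
"""


def open_db(path=":memory:"):
    """Open (or create) the local metadata database."""
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db


def local_revision(db):
    """Return the revision the local database was last synced to."""
    return db.execute("SELECT revision FROM sync_state WHERE id = 1").fetchone()[0]


def apply_diff(db, new_revision, changed, removed):
    """Apply one diff: changed is a list of (cpv, keywords, depend) tuples,
    removed is a list of cpv strings. All of it happens in one transaction."""
    with db:
        db.executemany(
            "INSERT OR REPLACE INTO package (cpv, keywords, depend, revision) "
            "VALUES (?, ?, ?, ?)",
            [(cpv, kw, dep, new_revision) for cpv, kw, dep in changed],
        )
        db.executemany("DELETE FROM package WHERE cpv = ?",
                       [(cpv,) for cpv in removed])
        db.execute("UPDATE sync_state SET revision = ? WHERE id = 1",
                   (new_revision,))
```

A sync would then ask the mirror for everything newer than `local_revision(db)` and pass the answer to `apply_diff`, so only the changed rows ever travel over the network.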

Posted: **Thu Sep 01, 2005 7:53 pm**

Redeeman wrote: I agree, Portage is becoming too slow, but it's not entirely Python's fault. Portage just wasn't designed to scale as far as it's being asked to; the number of ebuilds has exploded over the past few years. Recoding Portage in Python would be okay too, but it should be entirely restructured.

Instead of having all the ebuilds on your own box, they should be on a mirror.

The only thing you should have locally is a sqlite or some other small db with the metadata: the package name, keywords, dependency information and so on. That way it would be a database of about 10 MB at most, and when you sync you only get a diff since the last time you synced (this would work with some commit id, possibly like svn, where you update from revision <nr> of the metadata database). The ebuilds and any patches and init scripts are then fetched when you emerge.

Of course you should still be able to use overlays, and also to download an ebuild, make changes and then emerge it.

But this way, syncing a lot wouldn't hurt the Gentoo mirrors so much, as they would probably send out roughly 200 or 300 kB of data for a sync covering one day, instead of A LOT.
It would also be smashingly fast, and it wouldn't take up so much space, which, if you think about it, is quite stupid the way it works now.

I like your ideas. Storing ebuild information in a database system would not only increase performance but would make it possible to use the database of our choice. This will make Portage (and Gentoo) easier to deploy on a larger scale by using a database server when multiple machines are using the same repository. It'll also be MUCH easier to handle reverse dependencies.
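
As a rough illustration of the reverse-dependency point: with the dependencies in a table, "what depends on X?" becomes a single indexed query. This is only a sketch with an in-memory sqlite database; the table layout and the dependency rows are made up for the example and ignore versions, USE flags and the like.

```python
import sqlite3


def build_example_db():
    """Create an in-memory database with a few illustrative dependency rows."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE dependency (
            package    TEXT NOT NULL,
            depends_on TEXT NOT NULL,
            PRIMARY KEY (package, depends_on)
        );
        CREATE INDEX dependency_reverse ON dependency (depends_on);
    """)
    db.executemany("INSERT INTO dependency VALUES (?, ?)", [
        ("sys-apps/portage", "dev-lang/python"),
        ("app-portage/gentoolkit", "dev-lang/python"),
        ("app-portage/gentoolkit", "sys-apps/portage"),
        ("app-editors/vim", "sys-libs/ncurses"),
    ])
    db.commit()
    return db


def reverse_dependencies(db, package):
    """Return the packages that directly depend on the given package."""
    rows = db.execute(
        "SELECT package FROM dependency WHERE depends_on = ? ORDER BY package",
        (package,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(reverse_dependencies(build_example_db(), "dev-lang/python"))
```

The index on `depends_on` is what makes the reverse lookup cheap; without a database the same question means reading every ebuild's dependency list.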

Though, I think that Portage should stay in Python. Gentoo (with Portage) is sort of the "Python-dist", and there's also a bunch of shell scripts that I think should be rewritten in Python.
One thing is for sure though: NOW is the time to rewrite Portage. If we let it grow much larger it'll be too big a project to rewrite, and it'll never be done.

That's my thoughts anyway...

Posted: **Fri Sep 09, 2005 2:57 pm**

Valheru wrote:
Alejandro Nova wrote: I also don't know why the Portage developers don't try things like Psyco, an optimizer that can speed up Python apps painlessly, before attempting any port to C++.
Because Daniel Robbins was violently against a C++ version of Portage for reasons that some people thought were idiotic. One dev that I know of even left because of it. Daniel loved Portage, a little too much I think. I'd love to see Portage redone in C++, because the slowness of the Python scripts is just killing Gentoo. What's the point of ricing your system's CFLAGS just to have it run slower than a dead donkey because the scripts bring it to its knees?

I agree with you, because I think the base system of Gentoo should contain only what a running GNU/Linux system needs (glibc, coreutils, Bash, an editor, the kernel).
No Python, no Perl, no Ruby, no Java, no Mono, none of the things that scare RPM users *gg*. They may be nice for a small program or for learning to program, but there is no reason to include them in an operating system. I never understood why Portage is written in Python, because Portage is not a small "quick and dirty" program or anything like that.

But the real reason Portage is so slow is something else: lots of small files.
We need a metadata filesystem or something like that, not a MySQL backend or Reiser4; those are only bugfixes.

Posted: **Fri Sep 09, 2005 3:18 pm**

hoschi wrote: No Python, no Perl, no Ruby, no Java, no Mono, none of the things that scare RPM users *gg*. They may be nice for a small program or for learning to program, but there is no reason to include them in an operating system. I never understood why Portage is written in Python, because Portage is not a small "quick and dirty" program or anything like that.

But the real reason Portage is so slow is something else: lots of small files.
We need a metadata filesystem or something like that, not a MySQL backend or Reiser4; those are only bugfixes.

What is it about Python, Perl and Ruby that makes them useful only for small "dirty" programs?

The slowness of portage, as reported a lot of times in the past, comes from the massive amounts of IO most of the commands have to do.
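
If you want to see that I/O cost for yourself, something like the sketch below will do. It is not how Portage reads the tree, just a plain walk that opens and reads every file and reports how many files, how many bytes and how long it took. `scan_tree` is a made-up name for this example.

```python
import os
import time


def scan_tree(root):
    """Read every regular file under root.

    Returns (file_count, total_bytes, elapsed_seconds).
    """
    start = time.perf_counter()
    files = 0
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    total += len(handle.read())
            except OSError:
                continue  # skip unreadable files and dangling symlinks
            files += 1
    return files, total, time.perf_counter() - start


if __name__ == "__main__":
    count, size, seconds = scan_tree("/usr/portage")
    print("%d files, %d bytes, %.2f s" % (count, size, seconds))
```

Running it twice in a row gives a feel for the difference between a cold and a warm disk cache, which is the part no rewrite in another language changes.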

I agree with putting Portage onto a database backend. That should speed up emerge sync (it would only need to fetch one or two small files), but it would also make the internet an even bigger dependency than it already is for Gentoo (though a change to emerge -f so that it grabbed the ebuild as well as the source would sort that out).

But exactly which parts of Portage can be made faster by rewriting them in C/C++? I would say any improvement you can make doing that can be made to look very, very pathetic by simply replacing your IDE drives with 15k rpm SCSI drives.

I don't mean to get at this project; I'm all for a version of Portage in a language I can grok. I only do C/C++. Just please, don't try to tell me that scripting languages "have no place in an Operating System". It's plain bull.

Posted: **Sat Sep 10, 2005 3:37 pm**

Okay,
you must see: solving dependencies with a package manager is a good thing, but having no dependencies is always better and simpler.
Why should I install a big and "slow" dependency like Python? It bloats my system.
Sure, Portage with Python is not slow, but it will never be as fast as a C/C++ app. On the other hand, Portage is not a really heavy thing like gnome-nautilus or Inkscape.

The reason I don't like programs with dependencies like Python or Ruby is not that they are slower than C/C++ apps (they are often small, so there is no big speed difference); it is that I don't like fat and slow systems with a lot of unnecessary software on them.

If I can replace Portage (Python) with a Portage in C/C++, or maybe even one based on Bash, I will do it.
You will say *what*, a Portage based on Bash?
Yes!
I don't like dependencies, and Bash is already installed on nearly every system...

Deps must die, Deps must die, Deps must die

Posted: **Sun Sep 11, 2005 9:55 pm**

What's this dependency phobia you are suffering from? I don't think it's rational at all. I also can't understand why adding Python to a base system would make it "fat and slow".

Apart from these things, I think we have some dangerous duplication of effort here. sportage and portage-c might be among the best software around, but I don't think they're the right way to address Portage's current impasse. A nice refactoring of Portage is what is needed, first of all introducing a database (sqlite?); then I think that lots of speed issues would simply go away or be easily fixed.

Posted: **Mon Sep 12, 2005 4:32 am**

I think so too, but nevertheless I think the current Portage code (I looked a bit through it) has grown enormously in the past and is very difficult to understand and change now. Therefore it'll be a lot easier to reimplement portage from scratch. And if somebody does that I understand that he/she chooses c/c++ over python. Why not, when you have to code everything from scratch anyway...
I do believe that many many people want to see a portage without python dependency.

Posted: **Mon Sep 12, 2005 5:22 am**

tempest wrote: What's this dependency phobia you are suffering from? I don't think it's rational at all. I also can't understand why adding Python to a base system would make it "fat and slow".

After just completing my first Linux From Scratch, and being lucky enough to be successful on my first try, I don't think his phobia is irrational at all. More dependencies add more complexity, and more possible points of failure, to what is already a complex process. I'm with hoschi on his minimalistic approach of depending only on what it takes to get a basic Linux up and running. Besides, using more universal system languages such as bash for scripting and C for programming lowers the barriers of entry for future developers, which could reduce the workload of present ones.

tempest wrote: Apart from these things, I think we have some dangerous duplication of effort here. sportage and portage-c might be among the best software around, but I don't think they're the right way to address Portage's current impasse. A nice refactoring of Portage is what is needed, first of all introducing a database (sqlite?); then I think that lots of speed issues would simply go away or be easily fixed.

Duplication of effort drives innovation, especially in the open source world where developers are free to reuse ideas between projects. Some people like to play with knobs and buttons to see what they do; others like to open the box, see how it does what it does, and perhaps even make it do it better. It's the second sort who take to projects such as portage-c and sportage.

Posted: **Tue Sep 13, 2005 10:12 am**

In this topic I read a lot of personal opinions that are senselessly promoted as absolute truths.

Before *even* thinking about reimplementing Portage, what about sitting down for a moment and locating the actual problem(s)? Rewriting software from scratch just because we hope it will come out better this time means losing time rewriting something that already works, a standstill in standard Portage, and a lot of bugs to fix in the new one as development goes through its natural cycle.

Besides that, what are the rational and objective arguments against Python for Portage, apart from groundless dependency phobia? Why would C (!!!) or C++ be "better" than Python for Portage?

EzInKy: I'm happy that you managed to install your LFS, and I'm sorry that you had to handle complexity caused by dependencies. That's not a problem with Gentoo since Portage always came integrated with the distribution and it's a task that has to be performed only once.

EzInKy wrote:Besides, using more universal system languages such as bash for scripting and C for programming lowers the barriers of entry for future developers

Did you *really* mean what you wrote? Bash and C give a program a *lower* barrier for future developers? Can you elaborate on this?

In closing, yes, duplication of effort drives innovation, and in fact sportage has shown us that some parts of Portage can be reimplemented to run much faster. And that's precisely what I'm saying: let's start the work on Portage.

Posted: **Tue Sep 13, 2005 10:14 pm**

tempest wrote: EzInKy: I'm happy that you managed to install your LFS, and I'm sorry that you had to handle complexity caused by dependencies. That's not a problem with Gentoo since Portage always came integrated with the distribution and it's a task that has to be performed only once.

I've been using Gentoo as my main OS for three years now, and before that Debian. I understand the purpose of Portage and other package managers.

tempest wrote: Did you *really* mean what you wrote? Bash and C give a program a *lower* barrier for future developers? Can you elaborate on this?

Probably not, but the fun is in trying. That's one of the reasons I'm playing with LFS.

Posted: **Tue Sep 13, 2005 10:30 pm**

EzInKy wrote:Probably not, but the fun is in trying.

I'm with you on this one, but it all comes down to one's definition of "fun". I don't think I would have much fun rewriting the whole Portage core in C, fighting teething bugs, and maybe seeing halfway through that the real problem was old Portage's lack of a DB structure, and that I could have written 90% less code and implemented 90% more features in half the time using Python or Ruby... Or realizing in the end that it would have been better to locate the offending part of Portage and refactor it.

Anyway, it would be nice to hear opinions from the only people that will make the decision in the end: Portage developers. Are you out there?

Posted: **Wed Sep 14, 2005 6:40 am**

Let's try to do something creative instead of arguing over personal opinions. I checked the size of /usr/portage for the first time today and I must say that 1.4G is a bit much. Let's try to find the best (or most promising) solutions for a "future portage".

I too like the ideas that Redeeman pointed out earlier. Databases make handling ebuilds a lot faster and save space. Any other opinions on what the storage mechanism should be?

In my opinion C/C++ is well suited for this because:
- it's fast
- gcc is always installed (and g++)

On the other hand python/perl (perhaps bash. not much experience):
- is easier to learn and to use
- powerful and fast to develop on

With a good architecture we could easily use multiple languages. Not forgetting stuff like mono, which would make it even easier. Any suggestions for an architecture? Anybody with good experience doing DB-diagrams?

Posted: **Thu Sep 15, 2005 9:40 pm**

tempest wrote:
EzInKy wrote: Besides, using more universal system languages such as bash for scripting and C for programming lowers the barriers of entry for future developers
Did you *really* mean what you wrote? Bash and C give a program a *lower* barrier for future developers? Can you elaborate on this?

Maybe he means this:

See, look at Gnome. The guys at Gnome try to support every language, every one: Python, Ruby, C...
I think the Gnome project would move a little faster if they didn't need to support every language.

Different programming languages for different jobs: C or C++, or Bash for small scripts and things like that.
Maybe Java for some web-based programs that should run under every OS.
Now we can discuss: do we need a special language for everyone, is this the solution?

If i take a look at Sonance (first written in C)

Deps: Mono, Mono-Dev, Gnome-Mono, Gstreamer-Mono *ouch*

Posted: **Sun Sep 18, 2005 5:48 pm**

hoschi, I really don't get your point, if any, so I'll try to make mine clear. When a project starts and a platform (language/architecture/whatever) has to be chosen for it, priority should go to the platform that helps developers get the job done well and quickly, not to the one that minimizes dependencies on runtime environments (to what advantage, anyway?).

Posted: **Tue Sep 20, 2005 11:10 am**

tempest wrote: hoschi, I really don't get your point, if any, so I'll try to make mine clear. When a project starts and a platform (language/architecture/whatever) has to be chosen for it, priority should go to the platform that helps developers get the job done well and quickly, not to the one that minimizes dependencies on runtime environments (to what advantage, anyway?).

One advantage of minimizing dependencies is it increases compatibility. Languages that compile directly to machine code and provide a standard set of functions are easier to port to different machines.

Posted: **Tue Sep 20, 2005 11:32 am**

That is not necessarily true. When coding in C or another low-level language it is your responsibility to produce portable code, using portable libraries and supporting different compilers on different architectures. With an interpreted (or pseudo-interpreted) language your code is directly portable to anywhere there is an interpreter (or VM).

Posted: **Tue Sep 20, 2005 12:59 pm**

Jeremy_Z wrote: That is not necessarily true. When coding in C or another low-level language it is your responsibility to produce portable code, using portable libraries and supporting different compilers on different architectures. With an interpreted (or pseudo-interpreted) language your code is directly portable to anywhere there is an interpreter (or VM).

Right, interpreted languages are portable only if the required interpreter has been ported first. That's not a severe limitation on major platforms, but it does hinder development for embedded systems.

Posted: **Tue Sep 20, 2005 8:04 pm**

And to emphasize the problems with unnecessary dependencies I just ran into this bug while guiding a friend through a stage 1 install.

Posted: **Fri Sep 23, 2005 6:22 pm**

Jeremy_Z wrote: That is not necessarily true. When coding in C or another low-level language it is your responsibility to produce portable code, using portable libraries and supporting different compilers on different architectures. With an interpreted (or pseudo-interpreted) language your code is directly portable to anywhere there is an interpreter (or VM).

Right. Now, I'm not exactly a power user, but I've managed to break the runtime dependencies of portage a number of times. What happens then is that I have to find a way to fix them entirely on my own. There is no tool I can use, so I usually end up installing libc++, python and some more stuff by hand.

Now, maybe I am a little bit reckless, but with a statically compiled portage in C/C++ I would at least have had tools.

Posted: **Fri Sep 23, 2005 6:52 pm**

EzInKy wrote:
tempest wrote: hoschi, I really don't get your point, if any, so I'll try to make mine clear. When a project starts and a platform (language/architecture/whatever) has to be chosen for it, priority should go to the platform that helps developers get the job done well and quickly, not to the one that minimizes dependencies on runtime environments (to what advantage, anyway?).
One advantage of minimizing dependencies is it increases compatibility. Languages that compile directly to machine code and provide a standard set of functions are easier to port to different machines.

Yes! We can't offer every project its *own language*, that is stupid.
Look: Assembler (OK, this is not a language itself), C, C++, .NET/Mono (C#), Bash (or any other shell), Ruby, Perl, Python, Java. Do you want more?

There is no perfect language for every project, but a separate language for every project is not the solution.
We can live well with Assembler, C/C++ and a small basic language like Bash, or one language for the web-based apps (Java?).
But we can't live with ten different languages. You can help minorities (~some projects) with better development environments, but you can't give every project its own programming language "because they want it".

Maybe we can replace Ruby/Python/Perl with Mono, but I don't think so.

Posted: **Fri Sep 23, 2005 6:57 pm**

EzInKy wrote:And to emphasize the problems with unnecessary dependencies I just ran into this bug while guiding a friend through a stage 1 install.

But stage 1 sucks. Bob P's Stage 1/3 will give you all the optimization benefits without being so prone to small failures of that ilk, and you (or your friend) would have learned more anyway...

Reply here if you disagree; no sense hijacking this thread.

Posted: **Fri Sep 23, 2005 6:58 pm**

Oh no, no discussion about the stages

Posted: **Sat Sep 24, 2005 2:55 pm**

The reason that portage is so slow is not because of python, but because it does not use a database. By using a database it would be a lot faster, something like sqlite, which is only 30k of C code and is a fully featured, server-less database system. That would solve most of our problems. Getting portage to use a database as the back-end would not only save us space but would also be much faster. I am not sure why it has not been done before.

portage implementation discussion

let's stop this bickering