Gentoo Forums
Why has portage become so slow
_______0
Guru


Joined: 15 Oct 2012
Posts: 521

PostPosted: Mon Oct 28, 2013 10:55 pm    Post subject: Reply with quote

TomWij wrote:
...


You're starting to sound like a giant troll across your posts. I am familiar with the long-standing nicks here, and you just came out of the blue calling the shots and generally aggravating experienced ppl with your faux expertise.

Either troll, or AI bot.
TomWij
Developer


Joined: 04 Jul 2012
Posts: 1364

PostPosted: Mon Oct 28, 2013 11:13 pm    Post subject: Reply with quote

_______0 wrote:
TomWij wrote:
...


You're starting to sound like a giant troll across your posts. I am familiar with the long-standing nicks here, and you just came out of the blue calling the shots and generally aggravating experienced ppl with your faux expertise.

Either troll, or AI bot.


Why?

You seem to have formed a conception built on assumptions that do not appear to match reality, so it is hard to believe what you say, as it carries no weight; please back up your claims and stay relevant to the topic of this thread...
steveL
Advocate


Joined: 13 Sep 2006
Posts: 2164
Location: The Peanut Gallery

PostPosted: Tue Oct 29, 2013 4:37 am    Post subject: Reply with quote

TomWij wrote:
if you think 1% of the time, then no wonder that you end up 99% of the time with cleaning up the crap that was just the result of quick thoughts.

Just picking up on this: in fact I'm well-known for telling people on IRC that "thinking is 70% of the game" (can't mem who said that, could be Graham.) So you're wrong again.

As for quick thoughts, you should try commenting a little less on the forums imo, as many of your posts, particularly your long point-by-point rebuttals, appear to be the first thought that comes to your clearly-intelligent mind, that might make an argument. And you clearly love to argue.

As I stated elsewhere, I'd have to conclude you were a troll, if I did not know better.

And no the badge is not what makes the difference: there's many a "developer" who likes to troll, ime.

Feel free to do your usual point-by-point: just don't expect me to answer them all, as I don't have the headspace and to be blunt, most of them are usually you missing the point, afaic. But that's par for the course, really, when it comes to twenty-something males, especially geeks (which is not an insult, btw: I consider myself a geek, and I don't think of it as a negative term. A nerd now.. ;) Consider the word "ponder". (No, I do not want a discussion about it. I just want you to think about it for more than a day, in light of our recent exchanges, perhaps reviewing some of them as you do along with the phrase "missing the point". Up to you, ofc.)

And ffs stop telling people to stay on topic; this is an informal discussion forum, and we are allowed to wander off-topic; if needed the mods will split it off, and they do not need you to tell them when or where that should happen, just like you do not need them to tell you how to write an ebuild.
mv
Advocate


Joined: 20 Apr 2005
Posts: 3803

PostPosted: Tue Oct 29, 2013 9:21 am    Post subject: Reply with quote

schorsch_76 wrote:
@mv: Most of this can be replaced by boost. Even for older gcc versions.

Things like the mentioned "auto" cannot. The pointer features yes, but you get the overhead anyway and then have a fixed dependency on boost instead of a more current compiler - not sure which is the lesser evil. IMHO neither boost nor current C++11 compilers are something a "universal" tool like the package manager or eix should depend on. Concerning C++11, this may change in a few years...
mv
Advocate


Joined: 20 Apr 2005
Posts: 3803

PostPosted: Tue Oct 29, 2013 9:46 am    Post subject: Reply with quote

LoTeK wrote:
afaik many package managers are written in C or C++ (pacman, aptitude, etc), so why should it be much slower to develop?

They are far less complex, since they only have to consider a few hardcoded dependencies. No USE flags with all their complexity of USE dependencies, no handling of @revdep-rebuild/sub-slots etc. And the complex part of installation, sandboxing, well the whole handling of ebuilds and eclasses, is replaced by a mere unpack function.
Quote:
For the code size: in C you can see the size, it's not hidden from you, but in python, C++ it is.

It is unclear what you mean here. With each step up in language level, C->C++98->C++11->python, you have to write successively less code, which is partially compensated by a speed loss (unless you are very careful and sometimes omit the higher-level features in C++98/C++11).
Quote:
Since C is that old and mature, aren't there way enough libraries for everything we need with portage?

No, there aren't. Moreover, using libraries alone does not make C simple: You cannot properly encapsulate things and thus get bulky code anyway. Moreover, you spend a lot of messy code on memory handling. It is already a lot better with C++98, but then you have lengthy data types. And whatever you gain by automatic memory handling usually also means a certain speed loss.
Quote:
I can't comment on the implementation, but when you search a package with eix, you have your results in a blink of an eye, while with emerge -s pkg you wait, and wait and wait.

eix relies on a prebuilt database whose format is optimized for searching, and it does not care whether the data is up-to-date or correct. A fairer comparison would be "eix-update && eix pkg" (although this is also not completely fair, now in the opposite direction).
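The trade-off can be sketched in a few lines of C++. This is only an illustration of "build once, search fast", not eix's actual database format or code; `PackageIndex`, `build_index` and `contains` are hypothetical names.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for a prebuilt search cache; not eix's real format.
struct PackageIndex {
    std::vector<std::string> names;  // kept sorted so lookups can bisect
};

// The expensive step, done once (comparable in spirit to "eix-update").
PackageIndex build_index(std::vector<std::string> names) {
    std::sort(names.begin(), names.end());
    PackageIndex idx;
    idx.names = names;
    return idx;
}

// The cheap step: O(log n) lookup against the prebuilt data.
// The answer is only as fresh as the last build_index() call.
bool contains(const PackageIndex& idx, const std::string& name) {
    return std::binary_search(idx.names.begin(), idx.names.end(), name);
}
```

A caller would run `build_index` once after a sync and then call `contains` as often as needed; anything added to the tree afterwards stays invisible until the index is rebuilt, which is the staleness mentioned above.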
Quote:
If I have to pick python or C++, I probably would go with python, since C++ is IMO "not C", but also not "high-level OO" like for example java.

The speed difference between python/java and C++ is enormous. C++ has all the advantages of C and simultaneously practically all possibilities of java. Yes, there is some speed loss compared to C, but I am speaking here of a factor of about 1.1-1.2, not about a factor of 5-10 or even more which is perhaps realistic for python (not so sure about java).
schorsch_76
Tux's lil' helper


Joined: 19 Jun 2012
Posts: 136

PostPosted: Tue Oct 29, 2013 10:03 am    Post subject: Reply with quote

mv wrote:
schorsch_76 wrote:
@mv: Most of this can be replaced by boost. Even for older gcc versions.

Things like the mentioned "auto" cannot. The pointer features yes, but you get the overhead anyway and then have a fixed dependency on boost instead of a more current compiler - not sure which is the lesser evil. IMHO neither boost nor current C++11 compilers are something a "universal" tool like the package manager or eix should depend on. Concerning C++11, this may change in a few years...


Auto is handled at compile time by the compiler. At runtime it's simply the type which the compiler detected. auto is convenient. Nothing more.

example:
Version with auto
Code:

#include <map>
#include <string>
typedef std::map<int, std::string> mymap;
mymap data;
auto pos = data.find(5);  // C++11: deduced as mymap::iterator


"Classic" version:
Code:

#include <map>
#include <string>
typedef std::map<int, std::string> mymap;
mymap data;
mymap::iterator pos = data.find(5);  // same type, spelled out (valid C++98)


The smart pointers and stuff from boost are really fast. I use most of it under realtime conditions. operator->() is just one call. Nothing fancy. No performance hit.
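The compile-time claim about auto can be checked by the compiler itself. A minimal sketch, assuming a C++11 compiler; `has_key` is a hypothetical helper made up for this illustration.

```cpp
#include <map>
#include <string>
#include <type_traits>

typedef std::map<int, std::string> mymap;

// Returns true if the key is present; `pos` is declared with auto.
bool has_key(mymap& data, int key) {
    auto pos = data.find(key);
    // Compile-time proof that auto deduced exactly the "classic" type.
    static_assert(std::is_same<decltype(pos), mymap::iterator>::value,
                  "auto deduces mymap::iterator");
    return pos != data.end();
}
```

If the deduced type differed from `mymap::iterator`, the `static_assert` would stop compilation, so nothing about `auto` is left to be decided at run time.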
franzf
Advocate


Joined: 29 Mar 2005
Posts: 3840
Location: Irgendwo im Nirgendwo

PostPosted: Tue Oct 29, 2013 10:14 am    Post subject: Reply with quote

mv wrote:
Yes, there is some speed loss compared to C, but I am speaking here of a factor of about 1.1-1.2.

It depends on how you write your code. There was a quite large flamewar over at c-plusplus.de forums some time ago, where quite readable C++-code (readable for people who know about the features in use) ran faster than C code.
And those std::*ptr are not necessarily slower. Of course, if you don't care and always use shared_ptr, even where scoped_ptr would suffice, you get a performance overhead.
If you care about pre-C++11 compilers, you can use std::tr1::*ptr as fallback, typedef is your friend.
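One way such a fallback might look is sketched below. The `compat` namespace and `owners_after_copy` are hypothetical names, and the `<tr1/memory>` header path is the one GCC uses; other toolchains may place TR1 elsewhere. Since a class template cannot be typedef'd directly before C++11, the sketch pulls the name into a namespace with a using-declaration and typedefs only the concrete instantiation.

```cpp
// Pick the smart-pointer implementation the compiler provides.
#if __cplusplus >= 201103L
#include <memory>
namespace compat {
using std::shared_ptr;
}
#else
#include <tr1/memory>  // GCC's TR1 header location; an assumption for other toolchains
namespace compat {
using std::tr1::shared_ptr;
}
#endif

// Code below only ever names compat::shared_ptr.
typedef compat::shared_ptr<int> IntPtr;

long owners_after_copy() {
    IntPtr a(new int(42));
    IntPtr b = a;          // second owner of the same int
    return a.use_count();  // number of owners sharing the int
}
```

The rest of the code base then depends only on `compat::shared_ptr`, so switching between the two implementations is a matter of the preprocessor check at the top.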

Having said that I changed package manager from paludis (written completely in C++ + some bash) back to "slow" portage, because paludis did not really feel fast and I faced more and more issues. Finally I got fed up with Paludis' main dev telling Gentoo guys how to do packaging. I am happy now with portage-2.2.7 :) Even with "5-10x slower python" ;)
_________________
"der mac dennoch wesen geil"
Wolfram von Eschenbach, Parzival (Book 1, line 7).
An early statement against Windows.

My overlay
TomWij
Developer


Joined: 04 Jul 2012
Posts: 1364

PostPosted: Tue Oct 29, 2013 10:21 am    Post subject: Reply with quote

steveL wrote:
TomWij wrote:
if you think 1% of the time, then no wonder that you end up 99% of the time with cleaning up the crap that was just the result of quick thoughts.

Just picking up on this: in fact I'm well-known for telling people on IRC that "thinking is 70% of the game" (can't mem who said that, could be Graham.) So you're wrong again.

As for quick thoughts, you should try commenting a little less on the forums imo, as many of your posts, particularly your long point-by-point rebuttals, appear to be the first thought that comes to your clearly-intelligent mind, that might make an argument. And you clearly love to argue.

As I stated elsewhere, I'd have to conclude you were a troll, if I did not know better.


Why? You have just contradicted yourself. It further depends on how you define "troll"...

steveL wrote:
And no the badge is not what makes the difference: there's many a "developer" who likes to troll, ime.

Feel free to do your usual point-by-point: just don't expect me to answer them all, as I don't have the headspace and to be blunt, most of them are usually you missing the point, afaic. But that's par for the course, really, when it comes to twenty-something males, especially geeks (which is not an insult, btw: I consider myself a geek, and I don't think of it as a negative term. A nerd now.. ;) Consider the word "ponder". (No, I do not want a discussion about it. I just want you to think about it for more than a day, in light of our recent exchanges, perhaps reviewing some of them as you do along with the phrase "missing the point". Up to you, ofc.)

And ffs stop telling people to stay on topic; this is an informal discussion forum, and we are allowed to wander off-topic; if needed the mods will split it off, and they do not need you to tell them when or where that should happen, just like you do not need them to tell you how to write an ebuild.


What? Likewise, you do not need to tell me how to participate on the forums. If I want to report a post to get the thread back on topic, I will, because your post is not about Portage or its performance anymore...
mv
Advocate


Joined: 20 Apr 2005
Posts: 3803

PostPosted: Tue Oct 29, 2013 10:59 am    Post subject: Reply with quote

schorsch_76 wrote:
Auto is handled at compile time by the compiler. At runtime it's simply the type which the compiler detected. auto is convenient. Nothing more.

I didn't claim anything else. The main difference between python and C++ is convenience (and thus quicker writing/changing of code).
Well, C++ has the advantage of compile-time type checking which you partially lose with auto, opening a new field for bugs yet to be explored :wink: though I am sure it will never be as critical as in python (portage is already suffering severely from the lack of compile-time type checking... many runtime errors would have been avoided if one had to declare types).
Quote:
example:

Saving a typedef (or more frequently avoiding repeating the code, since in practice you do not put everything which you use only twice into a typedef) is a considerable enhancement - from the frequency I used it in eix, I guess this alone would save 5-10% of development time, not to speak about readability of the code and simplification if a function type is changed in a more global manner (e.g. from a vector to a list).
Quote:
The smart pointers and stuff from boost are really fast. I use most of it under realtime conditions. operator->() is just one call. Nothing fancy. No performance hit.

I also agree that it is not dramatic, but it needs some overhead (at least the handling of counters) in memory, code size and speed, which could often be avoided if you program on a lower level. But I agree that this small overhead should normally not make you avoid the convenience (and - more important - safety from easily made and hard to find mistakes). The reason it is not used in eix is not speed but fewer dependencies. However, it is hard to predict how big the penalty is in a big package manager; my guess is that it could make it 1-5% slower if used extensively in every tiny function just for convenience.
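The "handling of counters" can be made visible with the C++11 standard smart pointers. A minimal sketch; `Node` and both function names are hypothetical, and it says nothing about how large the cost is in a real package manager.

```cpp
#include <memory>
#include <utility>

struct Node { int value; };

// shared_ptr: every copy updates a reference counter kept alongside the object.
long shared_owner_count() {
    std::shared_ptr<Node> a = std::make_shared<Node>();
    std::shared_ptr<Node> b = a;  // counter goes from 1 to 2
    return a.use_count();
}

// unique_ptr: a single owner, so there is no counter to maintain;
// ownership can only be moved, never copied.
bool unique_moves_ownership() {
    std::unique_ptr<Node> a(new Node());
    std::unique_ptr<Node> b = std::move(a);  // a is now empty
    return !a && b;
}
```

Where a single owner is enough, `unique_ptr` (or `scoped_ptr` in boost) keeps the safety without the counter bookkeeping; `shared_ptr` pays for the counter on every copy and destruction.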
_______0
Guru


Joined: 15 Oct 2012
Posts: 521

PostPosted: Tue Oct 29, 2013 12:10 pm    Post subject: Reply with quote

just curious, seen so many ppl reporting slowness in portage. I am not an expert but which version and how is it being slow?

here:

sys-apps/portage-2.1.12.2

Slow because results are not instant during Calculating dependencies...?

Here test with 35 packages took a couple of seconds, comparable to apt-get and zypper.

This is without having tweaked portage and its associated components/parts (fs tweaks, hard drive, etc) for maximum speed.

Tried again with 2.2.7

Code:
real    0m37.008s
user    0m36.379s
sys     0m0.409s


No perceptible slowness.

The only time portage felt slow was when I added a trillion overlays. That drove me mad, and I didn't figure that part out until I did a clean install without any overlay and realized that they were at fault.

And in an irc conversation it was once stated that python 3 is slower due to Unicode support.

Also, for the sake of comparison, there is in fact a portage alternative coded in C++ -> paludis.

http://lwn.net/Articles/240399/

Would be interesting to see speed difference between the two.

I don't think it would be fair to compare portage to other package manager as Gentoo has to calculate way more with the USE flags taken into account.

Of the package managers I've tried (zypper, apt-get, pacman), the last is the fastest. The former two experience a significant delay during the calculation step.

Windows 7 update is by far the slowest shit on earth, feels when you are in the restroom and the turd wont come out, and painful.

One part where Gentoo beats all of them hands down is with eix. I often wonder the brains behind eix, but I know the speed is due to using a database.

I wonder if same concept could be applied to portage. Why not put ALL package related info into a database? Why the need of calculating on the fly as gentoo does now?

mm... wait, that'd mean putting all possible combinations ...

All in all portage isn't doing too bad, with the recent integration of revdep-rebuild. All that's left is integrating eix into portage and some USE flag info util such as euse.

Finally, I never run emerge world/system, because it never works cleanly :/
Genone
Retired Dev


Joined: 14 Mar 2003
Posts: 8995
Location: beyond the rim

PostPosted: Tue Oct 29, 2013 1:18 pm    Post subject: Reply with quote

TomWij wrote:
Genone wrote:
a) Rewrite in C isn't necessarily faster/better

With the focus on code of high performance; it can very well be, it is not uncommon to rewrite parts in C / C++ (eg. see pkgcore) to gain additional performance.

Of course it can be, and often is. Just saying that simply writing the same function in a different language doesn't make it magically faster. With higher-level languages you often have very optimized code hidden from you that isn't available when doing the same in a lower-level language (most obvious example: string handling in C).
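The string-handling point can be illustrated with a small sketch. Both function names are hypothetical; the first mimics a naive C-style loop, the second uses a string type that stores its own length, as higher-level string types generally do.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Naive C style: strlen() rescans the whole string on every iteration
// (unless the compiler manages to hoist it), so the loop can be O(n^2).
std::size_t count_slashes_c(const char* s) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < std::strlen(s); ++i) {
        if (s[i] == '/') ++n;
    }
    return n;
}

// std::string stores its length, so size() is O(1) and the loop is O(n).
std::size_t count_slashes_cpp(const std::string& s) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (s[i] == '/') ++n;
    }
    return n;
}
```

Both return the same result; the difference is only in how much work is hidden behind asking for the length, which is easy to get wrong when writing the low-level version by hand.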
Quote:
Genone wrote:
b) You'd have to reimplement a LOT of stuff (or add a ton of dependencies) that is available for free with python. This includes basic stuff like memory management, core datatypes like lists/maps/strings (or even unlimited integers) and goes up to high level features like XML handling and transparent compression.

While I can't comment about C; most of that is free in C++, and some of those high level features can be implemented in a single file. But well, there isn't much reason to be afraid of a small set of dependencies...

First, I was strictly talking about python->C. C++ includes quite some features in its standard library true, but compared to python you'd still need extra dependencies (working with C++ in my day job, and without Qt providing a lot of stuff I'd probably go nuts), and adds its own set of issues.
And dependencies are a major concern for a package manager, this is proven again and again whenever someone wants to update an old installation.

Quote:
Genone wrote:
c) Development and bugfixing would be much slower (more code to write for the same result, always recompile to test changes, no introspection facilities, ....)

It depends on how you write the code; I have not seen statistics or examples of how rewriting code in another language can lead to more code, it can very well be the opposite.

Ehm, you really want to argue that when writing the same non-trivial functionality (excluding special cases like interface wrappers) in python and C, the C code will not be more verbose by factor 5-10? Even with C++ and STL it's likely factor 2-4. If you really want a statistic for that you likely haven't seen much real-world code.
Quote:
Incremental recompilation as well as proper development practices makes recompilation easy; it shouldn't matter anyway, because the most time should go into design and the least into compilation.

Whether recompilation is easy or needs very little time, it is an extra step you have to perform for every little change. True, the static checks of the compiler can save you from a large class of errors, but OTOH you often waste time hunting non-bugs caused by binaries not matching source. And don't start with "proper development practices", that's theorycrafting at best.
Quote:
C++ supports introspection facilities; but yes, C indeed doesn't seem to.

And for some reason every generic C++ framework implements its own introspection facilities to avoid RTTI, and almost every book recommends against using reinterpret_cast ...

LoTek wrote:
But if the user expectations are "user-friendlyness" then there are an army of linux distributions that give them what they want.

Well, I never specifically said users expect "user-friendlyness" (as stupid as that sounds), as that term is a minefield to begin with. I was more referring to stuff like "I can set USE flags per package, why not CFLAGS", then later "I can set some env vars in package.env but not others" and "I can do this in /etc/portage but not in my profile". Or "revdep-rebuild can find broken libraries, why can't portage prevent them from breaking", then later "portage can rebuild broken libraries, but what about python/ruby/java/... modules?"

Things that are logical on some level, but significantly increase complexity over time. The other thing is that naturally users (in any audience) are more forgiving about errors/problems when something is new, but expect a more stable/polished product over time.
Quote:
But since the kernel and almost the whole UNIX environment are written in C, you can't avoid C. But if you have a package manager in python then you have to include a whole lot of code, just for a package manager, haven't you?
Why is it not better to have a clean environment that is as uniform as possible? I don't know what you mean with the "free stuff with python", but isn't the whole runtime, etc in the end written in C? For example when you create an object in a OO language, doesn't the language call "malloc" in the end?

In the end it's all converted to machine code one way or the other :P
And no, when you create an object the language doesn't necessarily call malloc 1:1, more likely it has allocated a larger memory block in advance and simply assigns addresses from that block internally. It's actually a good example how a naive C implementation can be slower/less efficient than using an optimized higher-level language. Just like using a SQL database will usually be faster than working directly on the filesystem level for many data-centric operations.
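The "larger memory block in advance" idea can be sketched as a tiny bump allocator. This is an illustration only, not how any particular language runtime is implemented; the `Arena` class is hypothetical and ignores alignment to stay short.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical bump ("arena") allocator: one big block up front, then each
// request is just an offset increment, with no per-object malloc call.
class Arena {
public:
    explicit Arena(std::size_t capacity) : buffer_(capacity), used_(0) {}

    // Returns storage for `size` bytes, or a null pointer if the block is full.
    // Alignment is ignored here; a real allocator must honor it.
    void* allocate(std::size_t size) {
        if (size > buffer_.size() - used_) return nullptr;
        void* p = buffer_.data() + used_;
        used_ += size;
        return p;
    }

    std::size_t used() const { return used_; }

private:
    std::vector<unsigned char> buffer_;  // the block allocated in advance
    std::size_t used_;                   // offset of the next free byte
};
```

Each `allocate` call costs a comparison and an addition, which is the kind of saving a naive one-malloc-per-object C implementation does not get for free.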
Quote:
Of course I can imagine the huge amount of work, but at least "in theory" it would be better to go this way, wouldn't it?

"Better" is a very subjective term. If you assume people come up with perfectly stable, portable, optimized and maintainable code, sure. But such a thing doesn't exist in reality.
What is better: having 10 MB of C code written/reviewed by maybe a handful of hobby-programmers, tested on a dozen system configurations at best before put into a public release, or a 30 MB codebase of C code written/reviewed by hundreds of professional programmers earning their living that way, tested on thousands of system configurations before going into a public release.
It's not just about the amount of work, but also about quality of work. Being good at dealing with graph problems doesn't mean you're good at writing parsers, IPC mechanics or memory management, no matter how much time you invest.
XavierMiller
Moderator


Joined: 23 Jul 2004
Posts: 5283
Location: ~Brussels - Belgique

PostPosted: Tue Oct 29, 2013 8:23 pm    Post subject: Reply with quote

_______0 wrote:
just curious, seen so many ppl reporting slowness in portage. I am not an expert but which version and how is it being slow?

here:

sys-apps/portage-2.1.12.2

Slow because results are not instant during Calculating dependencies...?

Here test with 35 packages took a couple of seconds, comparable to apt-get and zypper.

This is without having tweaked portage and its associated components/parts (fs tweaks, hard drive, etc) for maximum speed.

Tried again with 2.2.7

Code:
real    0m37.008s
user    0m36.379s
sys     0m0.409s


No perceptible slowness.


Hello,

3 minutes on my machine. Is that fast ? ;)
_________________
Xavier Miller
(FR) Please respect the forum rules.
http://www.xaviermiller.be
TomWij
Developer


Joined: 04 Jul 2012
Posts: 1364

PostPosted: Tue Oct 29, 2013 9:11 pm    Post subject: Reply with quote

Genone wrote:
Of course it can be, and often is. Just saying that simply writing the same function in a different language doesn't make it magically faster. With higher-level languages you often have very optimized code hidden from you that isn't available when doing the same in a lower-level language (most obvious example: string handling in C).


This is due to paradigms; if you take the exact same function and implement it in the exact same way, you might even discover that things in C++ can be slower than in Python. It heavily depends on how you use the language, how well you put what it provides to good use, and whether you pick the right thing for the task. There are tons of ways to deal with a list in multiple languages; a simple proof one could attempt is to implement the exact same algorithm for working with a list in both C++ and Python and show that Python is slower. However, since people are not meant to program in Python that way, this simple proof does not hold; it is no basis on which to say that Python is slower than C++.

Because if you go and use list comprehensions, you'll discover that they come close to the speed C++ reaches, with the minor overhead of being interpreted; though if you JIT it (or however it is called in Python) in advance, that won't be a problem either. It's just ... with C++ you need to take a good look at how to implement the algorithms effectively and properly, whereas with Python the focus lies more on benefiting from its optimized code. If you try to find a hundred code examples of something for either language, you'll find that in neither language does everybody write it in the most efficient way.

However, I still have the opinion that when you take into account performance from the start and consistently apply it everywhere and don't want to run into the limits of what Python provides you; that you might end up writing a program that reaches higher performance in C or C++. But hey, who am I to tell; I'm just a random student that has a background in C++ but doesn't know all the Python tricks and even regards Python as a prototyping language, I very well accept that you might have a very opposite opinion on this matter.

I simply am not confident with high performance code in Python; and part of that might be why I have not yet managed to contribute a performance improvement to Portage. I started my own PM some time ago, but its progress stalled early and it has been dead for quite a while, simply because I'm not yet convinced that the performance problem we are facing today is big enough to be concerned about. I can't make up my mind on it: try to improve Portage (but fail to find something because I'm not familiar with it), contribute to pkgcore (learning curve, risk of slowing it down) or just continue to carefully write my own from the bottom up (one-man army; which people like John Carmack [Doom, Quake], Donald Knuth [TeX], Woz [Apple], RMS [GCC, Emacs] and more have shown to work). But well; maybe, on the other hand, it is too early for me to take on the bigger tasks.

Genone wrote:
First, I was strictly talking about python->C. C++ includes quite some features in its standard library true, but compared to python you'd still need extra dependencies (working with C++ in my day job, and without Qt providing a lot of stuff I'd probably go nuts), and adds its own set of issues.
And dependencies are a major concern for a package manager, this is proven again and again whenever someone wants to update an old installation.


Boost contains quite a lot of what is needed; perhaps not all, but not much is missing. A bonus is that a lot of it is header-only; so you might not even need a working compiler to be able to install it. Dependencies shouldn't be too much of a concern; what we kind of miss are the necessary scripts to get a broken installation working again. Or well, perhaps those even exist and we just don't know of them, though I guess it takes an unexpected package manager failure for us to write such scripts ourselves, given all the other stuff that is on our list of things to be done...

Genone wrote:
Ehm, you really want to argue that when writing the same non-trivial functionality (excluding special cases like interface wrappers) in python and C, the C code will not be more verbose by factor 5-10? Even with C++ and STL it's likely factor 2-4. If you really want a statistic for that you likely haven't seen much real-world code.


That depends on how you write the code; I know a lot of people write it lengthily, but you can opt not to. For example, by using <algorithm>.
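A small sketch of what using <algorithm> buys, assuming C++11 for `std::copy_if` and the lambda; both function names are hypothetical. For comparison, the Python equivalent would be a one-line list comprehension such as `[x for x in xs if x % 2 == 0]`.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hand-written loop version.
std::vector<int> evens_loop(const std::vector<int>& in) {
    std::vector<int> out;
    for (std::vector<int>::const_iterator it = in.begin(); it != in.end(); ++it) {
        if (*it % 2 == 0) out.push_back(*it);
    }
    return out;
}

// <algorithm> version: copy_if plus a lambda states the intent directly.
std::vector<int> evens_algorithm(const std::vector<int>& in) {
    std::vector<int> out;
    std::copy_if(in.begin(), in.end(), std::back_inserter(out),
                 [](int x) { return x % 2 == 0; });
    return out;
}
```

The second version is shorter, but still several lines against Python's one, so the example supports both sides of the argument to some degree.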

Genone wrote:
Whether recompilation is easy or needs very little time, it is an extra step you have to perform for every little change. True, the static checks of the compiler can save you from a large class of errors, but OTOH you often waste time hunting non-bugs caused by binaries not matching source. And don't start with "proper development practices", that's theorycrafting at best.


What you describe "binaries not matching source" is a problem with the build system and not compilation in particular; I have never experienced this before, so, I'm not sure how to even reproduce that.

And as for everything else: well, those are red herrings that have to do with development practices, which I do see making a difference in practice.

Genone wrote:
And for some reason every generic C++ framework implements its own introspection facilities to avoid RTTI, and almost every book recommends against using reinterpret_cast ...


If you find a better reason, feel free to share it; but from what I understand, this has to do with cross-platform performance, which does not hold up if you solely target Linux users.

A good read on this would be http://stackoverflow.com/a/4334421/47064 which goes quite into details; as for reinterpret_cast, that has to do with polymorphic classes.
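For reference, the cast that actually relies on RTTI is dynamic_cast, which requires a polymorphic class; reinterpret_cast performs no run-time check at all. A minimal sketch, assuming C++11, with hypothetical `Package` and `BinaryPackage` classes:

```cpp
#include <typeinfo>

struct Package { virtual ~Package() {} };          // polymorphic: has a virtual function
struct BinaryPackage : Package { int size = 0; };  // hypothetical subclass

// dynamic_cast consults RTTI and yields a null pointer when the object
// is not actually a BinaryPackage.
bool is_binary(Package* p) {
    return dynamic_cast<BinaryPackage*>(p) != nullptr;
}
```

Frameworks that build their own introspection typically replace exactly this kind of check with their own type tags or meta-objects.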

Genone wrote:
Well, I never specifically said users expect "user-friendlyness" (as stupid as that sounds), as that term is a minefield to begin with. I was more referring to stuff like "I can set USE flags per package, why not CFLAGS", then later "I can set some env vars in package.env but not others" and "I can do this in /etc/portage but not in my profile". Or "revdep-rebuild can find broken libraries, why can't portage prevent them from breaking", then later "portage can rebuild broken libraries, but what about python/ruby/java/... modules?"

Things that are logical on some level, but significantly increase complexity over time. The other thing is that naturally users (in any audience) are more forgiving about errors/problems when something is new, but expect a more stable/polished product over time.


Yes, there is never an end to improving software; such requests like you describe are what makes it more preferable to work on a new package manager, as opposed to work on something that was not written with the intention to match what we want it to do today.

Genone wrote:
In the end it's all converted to machine code one way or the other :P
And no, when you create an object the language doesn't necessarily call malloc 1:1, more likely it has allocated a larger memory block in advance and simply assigns addresses from that block internally. It's actually a good example how a naive C implementation can be slower/less efficient than using an optimized higher-level language. Just like using a SQL database will usually be faster than working directly on the filesystem level for many data-centric operations.


It all really depends on how you write it; I'm going to take a step higher, good readings are http://www.chrisstucchio.com/blog/2013/hadoop_hatred.html and http://www.computerworld.com/s/article/9243620/Big_data_needs_more_than_Hadoop_says_Facebook_exec which can really help you decide whether to use Hadoop or SQL. On a lower level, the same can apply to using a SQL database versus a filesystem, because at some point the amount of data becomes small enough that the database just becomes overhead all over the place.

Genone wrote:
"Better" is a very subjective term. If you assume people come up with perfectly stable, portable, optimized and maintainable code, sure. But such a thing doesn't exist in reality.
What is better: having 10 MB of C code written/reviewed by maybe a handful of hobby-programmers, tested on a dozen system configurations at best before put into a public release, or a 30 MB codebase of C code written/reviewed by hundreds of professional programmers earning their living that way, tested on thousands of system configurations before going into a public release.
It's not just about the amount of work, but also about quality of work. Being good at dealing with graph problems doesn't mean you're good at writing parsers, IPC mechanics or memory management, no matter how much time you invest.


+1
TomWij
Developer
Developer


Joined: 04 Jul 2012
Posts: 1364

PostPosted: Tue Oct 29, 2013 9:12 pm    Post subject: Reply with quote

XavierMiller wrote:
3 minutes on my machine. Is that fast ? ;)


6 minutes on my machine. Yes, that is fast. (Just to throw in some pun)
_______0
Guru
Guru


Joined: 15 Oct 2012
Posts: 521

PostPosted: Tue Oct 29, 2013 11:22 pm    Post subject: Reply with quote

XavierMiller wrote:

Hello,

3 minutes on my machine. Is that fast ? ;)


Nope, but how is that happening? You haven't pinned down the cause?

Which cpu and hard drive do you have?

There's gotta be an explanation for that.

Can you try with 35 packages only?
TomWij
Developer
Developer


Joined: 04 Jul 2012
Posts: 1364

PostPosted: Tue Oct 29, 2013 11:30 pm    Post subject: Reply with quote

_______0 wrote:
XavierMiller wrote:

Hello,

3 minutes on my machine. Is that fast ? ;)


Nope, but how is that happening? You haven't pinned down the cause?

Which cpu and hard drive do you have?

There's gotta be an explanation for that.

Can you try with 35 packages only?


Please read http://forums.gentoo.org/viewtopic-p-7408888.html#7408888 and the rest of this thread, and note that I have over 1500 packages installed, of which 268 are in @world and 44 in @system. This is a fast enough SSD with an Intel(R) Core(TM) i7 CPU Q 720 @ 1.60GHz, which I deem not to be limiting factors on their own; I'm convinced the problem rather lies with complexity, because of the growing size of the Portage tree as well as the number of possible dependencies. I'm not convinced that just 35 packages would be the test we aim for; while that might be interesting for an embedded system, I think most people run many more packages on an install that they have had for at least a year (including clean-up efforts).
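As a rough illustration of why the number of selected packages says little about the work involved, here is a toy sketch that walks a dependency graph and counts how many packages and edges get touched. This is not Portage's resolver, which also has to deal with USE flags, slots, blockers and any-of dependencies; the function name and the package names are made up.

```python
from collections import deque


def dependency_closure(roots, deps):
    """Return (all reachable packages, number of dependency edges examined)."""
    seen = set(roots)
    queue = deque(roots)
    edges_examined = 0
    while queue:
        pkg = queue.popleft()
        for dep in deps.get(pkg, ()):
            edges_examined += 1      # every declared dependency costs a lookup
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen, edges_examined


# Hypothetical toy tree: two selected packages pull in three more.
toy_deps = {
    "app-a": ["lib-x", "lib-y"],
    "app-b": ["lib-y"],
    "lib-x": ["lib-z"],
    "lib-y": ["lib-z"],
    "lib-z": [],
}
closure, edges = dependency_closure(["app-a", "app-b"], toy_deps)
```

Even in this simplified form, the cost follows the reachable graph (packages plus edges) rather than the size of the selected set, so a 35-package test would exercise far less of it than a 1500-package install.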
steveL
Advocate
Advocate


Joined: 13 Sep 2006
Posts: 2164
Location: The Peanut Gallery

PostPosted: Wed Oct 30, 2013 7:44 am    Post subject: Reply with quote

TomWij wrote:
Quote:
And ffs stop telling people to stay to the topic; this is an informal discussion forum, and we are allowed to wander off-topic; if needed the mods will split it off, and they do not need you to tell them when or where that should happen, just like you do not need them to tell you how to write an ebuild.


What? Just like that you do not need to tell me how to participate on the forums. If I want to report to restore the topic; so I will, because your post is not about Portage or its performance anymore...

Feel free to do w/e you want: but continually throwing your weight around by telling people to stay on-topic etc, is just annoying, and in some quarters considered an aspect of trolling (as are constant point-by-point rebuttals while ignoring the actual point/s being made). By all means ask the mods to split off a discussion if it's gone off-topic, if you think they've missed something: just please stop telling us what to do. I do think your approach to the forums is off: you make it far too formal, and indeed threaten to report people, with snide references to other media that you do not know the context for. That's just my opinion, of course: just like every other post I make.

My post merely answered a point you made; I didn't think it worth getting into your statements requesting statistics about statement count differences between high and low-level languages. Genone pretty much answered it:
Quote:
If you really want a statistic for that you likely haven't seen much real-world code.
Unless you want to argue definitions again, in which case, do it with someone else.
With respect, I shan't be responding to you again on the forums for a good few weeks or months. Firstly it's too tedious, and secondly I don't want to get reported for trying to reason with you. Life's too short.

And yeah that's not about portage: but then nor is everything you've written. And I have every right to defend myself against what I see as baseless accusations, in the same place as they are made.
ArneBab
Guru
Guru


Joined: 24 Jan 2006
Posts: 361
Location: Graben-Neudorf, Germany

PostPosted: Wed Oct 30, 2013 9:14 am    Post subject: Reply with quote

TomWij wrote:
There are not many alternatives. Paludis was shown to be slower in this thread, though I can't and won't confirm that. The other alternative is to wait for pkgcore to fully catch up with EAPI 5 (and soon EAPI 6); it was written to be more efficient, but I have not yet tried it out to see how much of that is true. Catching up with Portage is the main concern here; there is no use in running something when it doesn't work most of the time or misses features that save a lot of time in other places.


Actually that necessary catchup work made me resent the new EAPIs: pkgcore worked wonderfully well. Then came the next EAPI and I had to use the much slower portage again.

I still use pquery nowadays (searching via pkgcore) and I would wish that emerge -s would just call the same functions as pquery in the background, so that the excellent searching component of pkgcore becomes part of the Gentoo core workflow (and as result of that stays available). I really would not want to use Gentoo without my trusty pix:

Code:
alias pix='pquery --raw -nv --attr=keywords --attr=license'


steveL wrote:
What I'm saying is, I wish Gentoo would call time on the portage codebase, put it into maintenance mode, and focus on pkgcore instead. The work's all been done, and it runs incredibly quickly in my experience; to the extent that about 5 years ago we added an "smerge" mode to update which uses the pkgcore resolver (ie pmerge) to get the list, and portage to build them (this was in the days before --jobs.) This was specifically to make maintenance of update a lot easier as well, since you tend to have to re-run the resolve a lot when you're hacking your front-end code to bits.


I did not know that - cool!

TomWij wrote:
Because to reimplement you need to start from the ground-up; why bother with pkgcore, joining forces will slow its progress down as per The Mythical Man-Month.


Because The Mythical Man-Month only applies when you already have an established group working on it.

It’s great to hear that 2 people started hacking on pkgcore in July again, and since they are likely still working on really understanding the code, this would be the perfect time to throw coders at the problem.

So if you know any company which reaps enough profit off using Gentoo that they can hire Gentoo devs: This would be the perfect time for them to get active.
_________________
Being unpolitical means being political without realizing it. - Arne Babenhauserheide ( http://draketo.de )

pkgcore: So fast that it feels unreal - by doing only what is needed.
steveL
Advocate
Advocate


Joined: 13 Sep 2006
Posts: 2164
Location: The Peanut Gallery

PostPosted: Wed Oct 30, 2013 9:45 am    Post subject: Reply with quote

ArneBab wrote:
I really would not want to use Gentoo without my trusty pix:
Code:
alias pix='pquery --raw -nv --attr=keywords --attr=license'

Heh, nice one. Been ages since I was able to use pkgcore as well.
Quote:
steveL wrote:
What I'm saying is, I wish Gentoo would call time on the portage codebase, put it into maintenance mode, and focus on pkgcore instead. The work's all been done, and it runs incredibly quickly in my experience; to the extent that about 5 years ago we added an "smerge" mode to update which uses the pkgcore resolver (ie pmerge) to get the list, and portage to build them


I did not know that - cool!

Yeah, it's always had the emerge=pmerge option as well ofc (in /etc/update and paludis, come to think of it, but no-one ever used that: I don't think that's surprising though, given the history;) Once we did that, we just had to do smerge, since there were sometimes problems building packages with pmerge, but the resolver was always spot on (and so fast.)
Quote:
It’s great to hear that 2 people started hacking on pkgcore in July again, and since they are likely still working on really understanding the code, this would be the perfect time to throw coders at the problem.

Yeah radhermit's been hacking away, and dol-sen has always done excellent python work (eg gentoolkit nextgen, which we're all now using.)
Quote:
So if you know any company which reaps enough profit off using Gentoo that they can hire Gentoo devs: This would be the perfect time for them to get active.

A company is far more likely to send in one of their own developers, so that they can use the expertise in-house for their own installations or project. I recently met one such on IRC, and it was really nice to chat with him (he'll hopefully contribute something back to the kernel-sources for binary packages, as that's what he's working on for his employer.)

Glad to see you're still about Arne.
TomWij
Developer
Developer


Joined: 04 Jul 2012
Posts: 1364

PostPosted: Wed Oct 30, 2013 10:35 am    Post subject: Reply with quote

steveL wrote:
Feel free to do w/e you want: but continually throwing your weight around by telling people to stay on-topic etc, is just annoying


People constantly going off-topic is annoying as well; action leads to reaction, and I wouldn't throw in any weight if people would use the proper place to address their off-topic matters and abide by the forum guidelines (which are meant to make this a better place).

steveL wrote:
By all means ask the mods to split off a discussion if it's gone off-topic, if you think they've missed something: just please stop telling us what to do. I do think your approach to the forums is off: you make it far too formal, and indeed threaten to report people, with snide references to other media that you do not know the context for. That's just my opinion, of course: just like every other post I make.

...

My post merely answered a point you made; I didn't think it worth getting into your statements requesting statistics about statement count differences between high and low-level languages.


It's fine if you do not want to get into it; but just like you told me before what to do I am going to tell you something similar to what you have told me: Please state that you don't want to, or don't reply, instead of using the forums in an informal off-topic way where a formal on-topic answer is expected. I am here for formal discussion, there is nothing wrong with wanting that; due to its formality it isn't disrespectful in any way, so, I still do not understand why it is perceived as such...

I almost fully agree with the last post you have made in the systemd headsup topic; apart from one sentence, where you "perceive this as #gentoo-chat" but I want this to be a more formal #gentoo-discuss *. Let's just agree to disagree here that we're not intending to use the forums in the same way; no matter the amount of words we're going to throw at it, it's not going to change either person's way of using this forum. I'm not going to spill any more words on this misunderstanding, not now and not in the future. Sorry and thank you.

To get back on topic; my point about statistics ("I have not seen statistics") is quite different from what you have understood. Because some of us will never agree on a trustworthy source, there can't be any statistics in the first place. My statement did not request that I want to have statistics, it rather meant to say that there are none that would fit this discussion (and you seem to agree, it indeed wouldn't be worth the waste of time to even try to obtain them).

(* #gentoo-discuss is a channel I started months ago to bring developers and users closer together to discuss matters, because there is no other IRC channel that really fits this purpose. But I have since closed that channel because it didn't attract many people yet, I think because I didn't advertise it enough; I might bring it up again soon...)
steveL
Advocate
Advocate


Joined: 13 Sep 2006
Posts: 2164
Location: The Peanut Gallery

PostPosted: Wed Oct 30, 2013 8:36 pm    Post subject: Reply with quote

TomWij wrote:
I wouldn't throw in any weight

Well at least you can admit that you have been throwing your weight around. Now concede that you are a forums-newb, please, despite your impressive post-count.
Quote:
It's fine if you do not want to get into it; but just like you told me before what to do I am going to tell you something similar to what you have told me: Please state that you don't want to, or don't reply, instead of using the forums in an informal off-topic way where a formal on-topic answer is expected. I am here for formal discussion, there is nothing wrong with wanting that; due to its formality it isn't disrespectful in any way, so, I still do not understand why it is perceived as such...

I told you why on IRC, and indeed here: people are here in their down-time, and they typically are more relaxed. If you keep telling everyone to stick to the topic every five minutes, then people will resent it; since it's just not how this works. We have moderators for that. Especially if you are also not taking time to consider what is being said to you, in the specific context, applied to the problem at hand. Simply put we get enough crap at work, and we don't want to have that feeling in our free-time.
Quote:
I almost fully agree with the last post you have made in the systemd headsup topic; apart from one sentence, where you "perceive this as #gentoo-chat" but I want this to be a more formal #gentoo-discuss *.

It does not matter what you want: first you have to adjust to your audience, and to the medium. For instance I would dearly love it if the Gentoo ML were exactly what you appear to want to make the forums into: a place to raise issues, and be told in no uncertain terms that posts with zero content, and not to the topic, are not allowed, and in fact where the weight you want to throw at users in a down-time forum, were instead thrown at developers posting snide comments and killing collaboration dead. That's what proctors were supposed to be about, until "developers" realised the implication: they'd be held to the same standards they throw at users, after they've insulted them for months on end.

So instead, after 6 months of the whole community, including all developers, deciding that proctors would be the way forward, suddenly it was "oh I don't think we really need proctors, after all we don't need to enforce all these rules." Yet the dev ML has much less of the atmosphere you claim to want for the forums, whereas the forums are supposed to be more informal.

It's appropriate for the official dev ML, which was deliberately set up as a place where devs would have to learn from their userbase, by stipulating a) that users, and indeed anyone, would always be allowed to post, and b) that Gentoo is here to serve its users.
It's not appropriate for the forums, imo, although courtesy and the basic respect of not attacking a person, but only critically analysing their statements, code, or work, are necessary everywhere that's not a private channel.

So in essence, you have it backwards afaic: you should apply exactly the same principles to the dev ML, not to our forums, which work very well already, and don't need you to "change" them into something they're not.
Quote:
To get back on topic; my point about statistics ("I have not seen statistics") is quite different from what you have understood. Because some of us will never agree on a trustworthy source, there can't be any statistics in the first place. My statement did not request that I want to have statistics, it rather meant to say that there are none that would fit this discussion (and you seem to agree, it indeed wouldn't be worth the waste of time to even try to obtain them).

And again, I am forced to wonder why you did not just say that, instead of asking questions you did not want an answer to. Or was that supposed to be rhetorical questioning of both me and Genone? Trust me, that kind of thing doesn't work well with a computer crowd, especially a mixed-language one, as the internet is: they'll either try to answer you, wasting time and getting frustrated when it turns out you are just making a point, but framing it as a question. Or they'll ignore you, or tell you why it's a dumb question. See how much quicker it would be, if you lost the debating-chamber flourishes, and simply made your point?

Quote:
#gentoo-discuss is a channel I have started months ago, to bring developers and users more together to discuss matters because there is no other IRC channel that really fits this purpose. But I have since closed that channel because it didn't attract much people yet because I didn't advertise it enough I think; I might bring it up again soon...)

I think you want the gentoo-project mailing-list for some of your discussion, and I would support a #gentoo-project equivalent IRC room, or indeed a forum of the same name.

But in essence, I'd recommend you apply your approach to the developer mailing-list (with much less rhetoric) and make that into a place where people can actually collaborate, and if people aren't interested in what's being discussed, they can STFU instead of posting like teenage fappers looking for validation.

Again, just my opinion of some of your fellow "developers." And they are in the minority; it's just funny how no developer ever calls out another developer on bad behaviour on the ML towards a user, but you are all so quick to reach for the formal process when you feel even a little bit under criticism. Fragile, insecure egos; we've all been there at some point or another. Geeks just tend to have very little self-insight, ime, by comparison to other sectors.
TomWij
Developer
Developer


Joined: 04 Jul 2012
Posts: 1364

PostPosted: Wed Oct 30, 2013 10:46 pm    Post subject: Reply with quote

steveL wrote:
Whereas the forums are supposed to be more informal.

That's your opinion, which your whole post is based on and tries to back up; it clearly opposes my opinion, which my previous posts are based on and have backed up. Now, who decides what the forums are to be? Neither of us. So, please stop telling me what to do or addressing me for what doesn't match your opinion; neither of us is convinced by the other's approach, and I'm not sure if we ever will be...

steveL (below) wrote:
Fine let's leave it there then; just bear in mind I'm not the one throwing my weight around, reporting people left, right and centre, and causing several other users to complain about trolling.


Thanks; will do, it's a consequence that will happen when someone tries to form an informal response to a formal request, I will now aim to just agree to disagree with them right away based on the difference in formality.


Last edited by TomWij on Wed Oct 30, 2013 11:30 pm; edited 3 times in total
steveL
Advocate
Advocate


Joined: 13 Sep 2006
Posts: 2164
Location: The Peanut Gallery

PostPosted: Wed Oct 30, 2013 10:59 pm    Post subject: Reply with quote

TomWij wrote:
steveL wrote:
Whereas the forums are supposed to be more informal.

That's your opinion, which your whole post is based on and tries to back up; it clearly opposes my opinion, which my previous posts are based on and have backed up. Now, who decides what the forums are to be? Neither of us. So, please stop telling me what to do or addressing me for what doesn't match your opinion; neither of us is convinced by the other's approach, and I'm not sure if we ever will be...

Fine let's leave it there then; just bear in mind I'm not the one throwing my weight around, reporting people left, right and centre, and causing several other users to complain about trolling.

Again, I do support your position in general, often find myself agreeing with you on the mailing-list, and have always had a great deal of respect for you, based on your conduct on IRC and the help you give users.

Can we at least agree that crap code is crap code whoever writes it? </rhetorical;>
TomWij
Developer
Developer


Joined: 04 Jul 2012
Posts: 1364

PostPosted: Wed Oct 30, 2013 11:29 pm    Post subject: Reply with quote

ArneBab wrote:
Actually that necessary catchup work made me resent the new EAPIs: pkgcore worked wonderfully well. Then came the next EAPI and I had to use the much slower portage again.


I wonder how much of the tree really uses the new EAPI though; it might be interesting to see how we can convert back EAPI 5 ebuilds to EAPI 4 ebuilds and just have it work again until the new EAPI is implemented, we can then do the same with EAPI 6 later on.
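One way to get a rough answer to how much of the tree uses each EAPI is to scan the ebuilds for their EAPI= assignment. The sketch below assumes the usual convention of an `EAPI=5` or `EAPI="5"` line near the top of the ebuild and treats a missing line as EAPI 0; the function names are made up, and it ignores eclasses and anything set indirectly.

```python
import re
from collections import Counter
from pathlib import Path

# Matches EAPI=5, EAPI="5" or EAPI='5' at the start of a line.
EAPI_LINE = re.compile(r'^EAPI=["\']?([A-Za-z0-9+_.-]+)["\']?', re.MULTILINE)


def ebuild_eapi(text):
    """Return the EAPI declared in an ebuild's text, or "0" if none is set."""
    match = EAPI_LINE.search(text)
    return match.group(1) if match else "0"


def count_eapis(tree_root):
    """Count ebuilds per EAPI below tree_root (a local copy of the tree)."""
    counts = Counter()
    for ebuild in Path(tree_root).rglob("*.ebuild"):
        counts[ebuild_eapi(ebuild.read_text(errors="replace"))] += 1
    return counts
```

Calling count_eapis on wherever your tree lives (for example /usr/portage, if that is where it is on your system) and printing the result's most_common() would show how many ebuilds would have to be converted back.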

ArneBab wrote:
I did not know that - cool!


Not sure if that (smerge) still works though, perhaps steveL can highlight; but wasn't the thing making pkgcore not work that the EAPI 5 resolver mods were not yet finished? Sounds like a crucial part for dep tree resolution, but please correct me if I am wrong...

ArneBab wrote:
Because The Mythical Man-Month only applies when you already have an established group working on it.


Ah, you've got me there; right, now I wonder how many people would make up an established group though. Joining up with pair programmers and understanding what they are doing can indeed be much easier.

ArneBab wrote:
It’s great to hear that 2 people started hacking on pkgcore in July again, and since they are likely still working on really understanding the code, this would be the perfect time to throw coders at the problem.


Yeah, understanding the code as well as knowing its structure and progress is the hard part; good documentation would help with that, but I'm not sure how much of that exists. Hmm, maybe just documenting the pkgcore code is an interesting task to do on its own; because it would make it much easier for new developers to contribute...
ArneBab
Guru
Guru


Joined: 24 Jan 2006
Posts: 361
Location: Graben-Neudorf, Germany

PostPosted: Wed Oct 30, 2013 11:33 pm    Post subject: Reply with quote

steveL wrote:
Glad to see you're still about Arne.




I keep using and enjoying Gentoo, though my time is much more limited these days. Luckily I could choose to install my own system for doing my PhD (the institute's admin said “if you maintain it yourself”), so that I am now using Gentoo at work, too ☺

Big task for the next week: Restore WLAN for my XO using GentooXO…

Anyway (and back OnTopic), I hope pkgcore will get active again! Sadly I could not find new stuff in its commits…
_________________
Being unpolitical means being political without realizing it. - Arne Babenhauserheide ( http://draketo.de )

pkgcore: So fast that it feels unreal - by doing only what is needed.
Reply to topic    Gentoo Forums Forum Index Portage & Programming All times are GMT
Goto page Previous  1, 2, 3, 4, 5, 6, 7  Next
Page 5 of 7

 