Gentoo Forums
Slowing Portage
Portage & Programming

cwr
Veteran

Joined: 17 Dec 2005
Posts: 1969

Posted: Thu Nov 13, 2014 12:31 pm    Post subject: Slowing Portage

There's been some recent discussion on the gentoo-devel mailing list of Portage's increasing
slowness, and someone mentioned cProfile, which I'd forgotten about. Below is a run of cProfile
on emerge, first with a single package (vim) and then with dependencies (xorg-x11). (Both
packages had already been built on the systems concerned.) The exact commands used were:
Code:

        python -m cProfile /usr/lib/portage/bin/emerge --pretend --verbose vim
        python -m cProfile /usr/lib/portage/bin/emerge --pretend --verbose --emptytree xorg-x11

The vim command was run twice, to preload the file I/O buffers, and the results are from the
second run. top showed nothing else significant running on the machine(s). The results have
been sorted in order of decreasing cumulative time, with only the first twenty entries shown.
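
For anyone wanting to reproduce this kind of listing, the sorting and truncation can be done with the standard-library pstats module. This is only a minimal sketch: resolve_deps is a made-up stand-in workload, not Portage code, and top_cumulative is a hypothetical helper name.

```python
import cProfile
import io
import pstats


def resolve_deps(n):
    # Hypothetical stand-in workload; not Portage code.
    return sum(i * i for i in range(n))


def top_cumulative(func, *args, limit=20):
    """Profile func(*args) and return a report sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # "cumulative" sorts on the cumtime column; limit keeps the first rows only.
    stats.sort_stats("cumulative").print_stats(limit)
    return out.getvalue()


report = top_cumulative(resolve_deps, 100000)
print(report)
```

On current Python versions, passing -s cumulative to python -m cProfile should give the same ordering straight from the command line.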

I ran the tests on an elderly laptop, on an old (but still sprightly) desktop, and then
on an old but still sprightly laptop, and it's pretty clear that yes, things have changed.
How the lost speed can be regained is another matter. It might be worth adding a profiling run
to the testing of any future Portage updates.

Will

===============================================================================

32-bit system with 1 GHz single core CPU and 1 GB RAM

Portage 2009-04-24
Python 2.5.4
Kernel 2.6.32
Gnome 2.24.3

2009-vim-cumul.txt

ncalls tottime percall cumtime percall filename:lineno(function)
1 0.002 0.002 12.982 12.982 {execfile}
1 0.028 0.028 12.980 12.980 emerge:6(<module>)
1 0.001 0.001 12.534 12.534 __init__.py:14265(emerge_main)
1 0.000 0.000 8.684 8.684 __init__.py:13349(action_build)
1 0.016 0.016 4.553 4.553 __init__.py:4333(__init__)
1 0.001 0.001 3.835 3.835 __init__.py:5217(select_files)
1 0.094 0.094 2.818 2.818 __init__.py:1022(__init__)
2 0.031 0.015 2.582 1.291 vartree.py:121(_counter_hash)
3231 0.296 0.000 2.565 0.001 vartree.py:435(aux_get)
1099 0.064 0.000 2.030 0.002 __init__.py:1419(__init__)
2 0.000 0.000 1.890 0.945 __init__.py:6648(altlist)
1 0.067 0.067 1.879 1.879 __init__.py:6347(validate_blockers)
1 0.000 0.000 1.879 1.879 __init__.py:6700(_resolve_conflicts)
3245 0.061 0.000 1.854 0.001 vartree.py:384(_aux_cache)
4 1.801 0.450 1.801 0.450 {built-in method load}
2 0.000 0.000 1.793 0.897 vartree.py:390(_aux_cache_init)
1 0.111 0.111 1.426 1.426 __init__.py:5148(_dep_expand)
2398 0.058 0.000 1.354 0.001 dep.py:492(__call__)
1087 0.055 0.000 1.341 0.001 __init__.py:1109(_aux_get_wrapper)

2009-xorg-cumul.txt

ncalls tottime percall cumtime percall filename:lineno(function)
1 0.002 0.002 39.394 39.394 {execfile}
1 0.028 0.028 39.391 39.391 emerge:6(<module>)
1 0.001 0.001 38.946 38.946 __init__.py:14265(emerge_main)
1 0.002 0.002 34.829 34.829 __init__.py:13349(action_build)
1 0.001 0.001 23.368 23.368 __init__.py:5217(select_files)
1 0.008 0.008 15.577 15.577 __init__.py:4768(_create_graph)
336 0.156 0.000 15.559 0.046 __init__.py:5023(_add_pkg_deps)
2999 0.072 0.000 10.193 0.003 __init__.py:5877(_select_pkg_highest_available)
3149 0.109 0.000 9.850 0.003 __init__.py:4782(_add_dep)
721 0.201 0.000 9.457 0.013 __init__.py:5900(_select_pkg_highest_available_imp)
691 2.852 0.004 8.406 0.012 __init__.py:1883(setcpv)
1 0.295 0.295 6.918 6.918 __init__.py:7413(display)
2 0.000 0.000 6.262 3.131 __init__.py:6648(altlist)
1 0.016 0.016 4.541 4.541 __init__.py:4333(__init__)
613 0.015 0.000 4.047 0.007 __init__.py:5714(_select_atoms_highest_available)
676/616 0.153 0.000 4.041 0.007 __init__.py:6652(dep_check)
1 0.038 0.038 3.309 3.309 __init__.py:6710(_serialize_tasks)

Portage 2012-01-02
Python 2.7.2
Kernel 2.6.38
Gnome 2.32.1

2012-vim-cumul.txt

ncalls tottime percall cumtime percall filename:lineno(function)
1 0.002 0.002 16.513 16.513 emerge:5(<module>)
1 0.001 0.001 15.973 15.973 main.py:1589(emerge_main)
1 0.000 0.000 14.529 14.529 actions.py:79(action_build)
1 0.000 0.000 14.363 14.363 depgraph.py:6906(backtrack_depgraph)
1 0.000 0.000 14.362 14.362 depgraph.py:6919(_backtrack_depgraph)
1 0.000 0.000 14.288 14.288 depgraph.py:1962(select_files)
1 0.015 0.015 11.432 11.432 depgraph.py:504(_load_vdb)
1 0.000 0.000 8.229 8.229 FakeVartree.py:139(sync)
1 0.029 0.029 8.157 8.157 FakeVartree.py:165(_sync)
943 0.066 0.000 7.733 0.008 FakeVartree.py:213(_pkg)
985 0.147 0.000 7.360 0.007 Package.py:43(__init__)
967/963 0.056 0.000 3.092 0.003 FakeVartree.py:97(_aux_get_wrapper)
985 0.110 0.000 2.902 0.003 Package.py:128(_validate_deps)
6973 0.590 0.000 2.830 0.000 __init__.py:265(use_reduce)
1 0.000 0.000 2.742 2.742 depgraph.py:2264(_resolve)
21614/21198 1.200 0.000 2.679 0.000 __init__.py:1063(__init__)
985 0.049 0.000 2.278 0.002 Package.py:208(_masks)
55 0.002 0.000 1.615 0.029 depgraph.py:3456(_select_pkg_highest_available)
46 0.002 0.000 1.611 0.035 depgraph.py:3500(_select_pkg_highest_available_imp)

2012-xorg-cumul.txt

ncalls tottime percall cumtime percall filename:lineno(function)
1 0.002 0.002 95.672 95.672 emerge:5(<module>)
1 0.001 0.001 95.129 95.129 main.py:1589(emerge_main)
1 0.001 0.001 93.642 93.642 actions.py:79(action_build)
1 0.000 0.000 82.571 82.571 depgraph.py:6906(backtrack_depgraph)
1 0.000 0.000 82.570 82.570 depgraph.py:6919(_backtrack_depgraph)
1 0.000 0.000 82.495 82.495 depgraph.py:1962(select_files)
1 0.005 0.005 70.761 70.761 depgraph.py:2264(_resolve)
2 0.024 0.012 54.955 27.477 depgraph.py:850(_create_graph)
1465 0.022 0.000 51.724 0.035 depgraph.py:1496(_add_pkg_dep_string)
1465 0.576 0.000 51.701 0.035 depgraph.py:1510(_wrapped_add_pkg_dep_string)
893 0.158 0.000 42.511 0.048 depgraph.py:1365(_add_pkg_deps)
3802 0.087 0.000 36.455 0.010 depgraph.py:3456(_select_pkg_highest_available)
2333 0.067 0.000 36.281 0.016 depgraph.py:3500(_select_pkg_highest_available_imp)
2333 0.698 0.000 36.204 0.016 depgraph.py:3729(_wrapped_select_pkg_highest_available_imp)
7813 0.226 0.000 29.924 0.004 depgraph.py:1748(_minimize_children)
10260 0.436 0.000 29.025 0.003 depgraph.py:3359(_iter_match_pkgs)
2124 0.348 0.000 26.064 0.012 Package.py:43(__init__)

===============================================================================

32-bit system with 2.6 GHz dual core CPU and 2 GB RAM

Portage 2010-01-31
Python 2.6.4
Kernel 2.6.32
Gnome 2.28.2

2010-vim-cumul.txt

ncalls tottime percall cumtime percall filename:lineno(function)
1 0.001 0.001 4.198 4.198 {execfile}
1 0.001 0.001 4.197 4.197 emerge:6(<module>)
1 0.000 0.000 4.089 4.089 main.py:1002(emerge_main)
1 0.000 0.000 3.856 3.856 actions.py:62(action_build)
1 0.000 0.000 3.808 3.808 depgraph.py:5434(_backtrack_depgraph)
1 0.000 0.000 3.808 3.808 depgraph.py:5422(backtrack_depgraph)
1 0.000 0.000 3.762 3.762 depgraph.py:1508(select_files)
1 0.005 0.005 2.658 2.658 depgraph.py:256(_load_vdb)
1 0.000 0.000 1.887 1.887 FakeVartree.py:86(sync)
1 0.008 0.008 1.862 1.862 FakeVartree.py:117(_sync)
1363 0.024 0.000 1.722 0.001 FakeVartree.py:164(_pkg)
2726 0.130 0.000 1.116 0.000 vartree.py:481(aux_get)
1403 0.030 0.000 1.102 0.001 Package.py:35(__init__)
1 0.000 0.000 1.073 1.073 depgraph.py:1778(_resolve)
2740 0.013 0.000 0.786 0.000 vartree.py:417(_aux_cache)
4 0.777 0.194 0.777 0.194 {built-in method load}
2 0.000 0.000 0.773 0.387 vartree.py:423(_aux_cache_init)
1400/1385 0.017 0.000 0.708 0.001 FakeVartree.py:67(_aux_get_wrapper)
1403 0.021 0.000 0.652 0.000 Package.py:56(_masks)

2010-xorg-cumul.txt

ncalls tottime percall cumtime percall filename:lineno(function)
1 0.001 0.001 14.131 14.131 {execfile}
1 0.001 0.001 14.130 14.130 emerge:6(<module>)
1 0.000 0.000 14.023 14.023 main.py:1002(emerge_main)
1 0.000 0.000 13.792 13.792 actions.py:62(action_build)
1 0.000 0.000 10.222 10.222 depgraph.py:5422(backtrack_depgraph)
1 0.000 0.000 10.221 10.221 depgraph.py:5434(_backtrack_depgraph)
1 0.000 0.000 10.173 10.173 depgraph.py:1508(select_files)
1 0.000 0.000 7.479 7.479 depgraph.py:1778(_resolve)
1 0.002 0.002 5.570 5.570 depgraph.py:708(_create_graph)
331 0.021 0.000 4.774 0.014 depgraph.py:1088(_add_pkg_deps)
728 0.051 0.000 4.511 0.006 depgraph.py:1215(_add_pkg_dep_string)
1 0.090 0.090 3.570 3.570 depgraph.py:3980(display)
3547 0.024 0.000 3.017 0.001 depgraph.py:2356(_select_pkg_highest_available)
1826 0.085 0.000 2.961 0.002 depgraph.py:2379(_select_pkg_highest_available_imp)
1 0.005 0.005 2.649 2.649 depgraph.py:256(_load_vdb)
4378 0.029 0.000 2.518 0.001 depgraph.py:1304(_minimize_children)
331 0.015 0.000 2.337 0.007 porttree.py:786(getfetchsizes)

Portage 2013-01-02
Python 2.7.3
Kernel 3.5.7
Gnome 2.32.1

2013-vim-cumul.txt

ncalls tottime percall cumtime percall filename:lineno(function)
1 0.001 0.001 8.232 8.232 emerge:5(<module>)
1 0.000 0.000 8.206 8.206 main.py:969(emerge_main)
1 0.000 0.000 7.818 7.818 actions.py:3424(run_action)
1 0.000 0.000 7.630 7.630 actions.py:90(action_build)
1 0.000 0.000 7.542 7.542 depgraph.py:7385(_backtrack_depgraph)
1 0.000 0.000 7.542 7.542 depgraph.py:7372(backtrack_depgraph)
1 0.000 0.000 7.530 7.530 depgraph.py:2272(select_files)
1 0.000 0.000 4.818 4.818 depgraph.py:2606(_resolve)
2 0.000 0.000 4.273 2.137 depgraph.py:5267(altlist)
1 0.000 0.000 4.011 4.011 depgraph.py:5386(_resolve_conflicts)
1 0.044 0.044 4.010 4.010 depgraph.py:4908(_validate_blockers)
1780 0.003 0.000 3.938 0.002 depgraph.py:4032(_pkg_visibility_check)
1806 0.004 0.000 3.935 0.002 Package.py:123(visible)
1577 0.003 0.000 3.930 0.002 Package.py:117(masks)
1500 0.022 0.000 3.926 0.003 Package.py:277(_eval_masks)
1524 0.004 0.000 3.075 0.002 Package.py:109(invalid)
1500 0.081 0.000 3.071 0.002 Package.py:180(_validate_deps)
10354 0.384 0.000 3.036 0.000 __init__.py:426(use_reduce)
30331/28988 0.667 0.000 2.800 0.000 __init__.py:1207(__init__)

2013-xorg-cumul.txt

ncalls tottime percall cumtime percall filename:lineno(function)
1 0.001 0.001 39.096 39.096 emerge:5(<module>)
1 0.000 0.000 39.065 39.065 main.py:969(emerge_main)
1 0.000 0.000 38.669 38.669 actions.py:3424(run_action)
1 0.000 0.000 38.444 38.444 actions.py:90(action_build)
1 0.000 0.000 34.245 34.245 depgraph.py:7372(backtrack_depgraph)
1 0.000 0.000 34.244 34.244 depgraph.py:7385(_backtrack_depgraph)
1 0.000 0.000 34.229 34.229 depgraph.py:2272(select_files)
1 0.000 0.000 31.538 31.538 depgraph.py:2606(_resolve)
2 0.011 0.005 26.920 13.460 depgraph.py:1254(_create_graph)
2512 0.011 0.000 24.641 0.010 depgraph.py:1822(_add_pkg_dep_string)
2512 0.249 0.000 24.630 0.010 depgraph.py:1836(_wrapped_add_pkg_dep_string)
1372 0.070 0.000 17.471 0.013 depgraph.py:1682(_add_pkg_deps)
4188 0.025 0.000 14.908 0.004 depgraph.py:3872(_select_pkg_highest_available)
2476 0.021 0.000 14.858 0.006 depgraph.py:3997(_select_pkg_highest_available_imp)
2476 0.220 0.000 14.823 0.006 depgraph.py:4214(_wrapped_select_pkg_highest_available_imp)
11941 0.179 0.000 13.978 0.001 depgraph.py:3770(_iter_match_pkgs)
2 0.000 0.000 11.652 5.826 depgraph.py:5267(altlist)

===============================================================================

32-bit system with 2.2 GHz dual core CPU and 2 GB RAM

Portage 2013-01-02
Python 2.7.3
Kernel 3.5.7
Gnome 2.32.1

2013-vim-cumul.txt

ncalls tottime percall cumtime percall filename:lineno(function)
1 0.001 0.001 28.154 28.154 emerge:5(<module>)
1 0.000 0.000 28.071 28.071 main.py:969(emerge_main)
1 0.000 0.000 26.765 26.765 actions.py:3424(run_action)
1 0.000 0.000 26.124 26.124 actions.py:90(action_build)
1 0.000 0.000 25.957 25.957 depgraph.py:7372(backtrack_depgraph)
1 0.000 0.000 25.956 25.956 depgraph.py:7385(_backtrack_depgraph)
1 0.000 0.000 25.897 25.897 depgraph.py:2272(select_files)
1 0.000 0.000 15.872 15.872 depgraph.py:2606(_resolve)
2 0.000 0.000 13.775 6.888 depgraph.py:5267(altlist)
1 0.000 0.000 12.806 12.806 depgraph.py:5386(_resolve_conflicts)
1 0.128 0.128 12.804 12.804 depgraph.py:4908(_validate_blockers)
1451 0.013 0.000 12.608 0.009 depgraph.py:4032(_pkg_visibility_check)
1477 0.015 0.000 12.594 0.009 Package.py:123(visible)
1248 0.009 0.000 12.575 0.010 Package.py:117(masks)
1171 0.068 0.000 12.566 0.011 Package.py:277(_eval_masks)
31289/30100 2.203 0.000 9.950 0.000 __init__.py:1207(__init__)
1 0.021 0.021 9.907 9.907 depgraph.py:521(_load_vdb)
1195 0.009 0.000 9.746 0.008 Package.py:109(invalid)
1171 0.197 0.000 9.738 0.008 Package.py:180(_validate_deps)

2013-xorg-cumul.txt

ncalls tottime percall cumtime percall filename:lineno(function)
1 0.001 0.001 118.819 118.819 emerge:5(<module>)
1 0.000 0.000 118.735 118.735 main.py:969(emerge_main)
1 0.000 0.000 117.432 117.432 actions.py:3424(run_action)
1 0.000 0.000 116.788 116.788 actions.py:90(action_build)
1 0.000 0.000 107.424 107.424 depgraph.py:7385(_backtrack_depgraph)
1 0.000 0.000 107.424 107.424 depgraph.py:7372(backtrack_depgraph)
1 0.000 0.000 107.365 107.365 depgraph.py:2272(select_files)
1 0.000 0.000 97.256 97.256 depgraph.py:2606(_resolve)
2 0.034 0.017 82.851 41.425 depgraph.py:1254(_create_graph)
2113 0.025 0.000 75.640 0.036 depgraph.py:1822(_add_pkg_dep_string)
2113 0.574 0.000 75.615 0.036 depgraph.py:1836(_wrapped_add_pkg_dep_string)
1043 0.159 0.000 55.123 0.053 depgraph.py:1682(_add_pkg_deps)
4220 0.080 0.000 45.954 0.011 depgraph.py:3872(_select_pkg_highest_available)
2501 0.067 0.000 45.768 0.018 depgraph.py:3997(_select_pkg_highest_available_imp)
2501 0.593 0.000 45.654 0.018 depgraph.py:4214(_wrapped_select_pkg_highest_available_imp)
11861 0.483 0.000 42.327 0.004 depgraph.py:3770(_iter_match_pkgs)
17211 0.164 0.000 33.258 0.002 depgraph.py:4032(_pkg_visibility_check)
17211 0.067 0.000 33.056 0.002 Package.py:123(visible)
2369 0.017 0.000 32.984 0.014 Package.py:117(masks)

Portage 2014-01-02
Python 2.7.5
Kernel 3.10.17
Gnome 2.31.1

2014-vim-cumul.txt

ncalls tottime percall cumtime percall filename:lineno(function)
1 0.001 0.001 37.328 37.328 emerge:5(<module>)
1 0.000 0.000 37.232 37.232 main.py:971(emerge_main)
1 0.000 0.000 34.872 34.872 actions.py:3610(run_action)
1 0.000 0.000 34.794 34.794 actions.py:95(action_build)
1 0.000 0.000 34.463 34.463 depgraph.py:7900(backtrack_depgraph)
1 0.000 0.000 34.460 34.460 depgraph.py:7913(_backtrack_depgraph)
1 0.000 0.000 34.403 34.403 depgraph.py:2637(_select_files)
1 0.000 0.000 34.403 34.403 depgraph.py:2622(select_files)
1 0.000 0.000 19.760 19.760 depgraph.py:2996(_resolve)
2 0.000 0.000 17.780 8.890 depgraph.py:5725(altlist)
1 0.000 0.000 16.976 16.976 depgraph.py:5853(_resolve_conflicts)
1 0.125 0.125 16.975 16.975 depgraph.py:5369(_validate_blockers)
1801 0.015 0.000 16.744 0.009 depgraph.py:4469(_pkg_visibility_check)
1826 0.020 0.000 16.729 0.009 Package.py:156(visible)
1631 0.012 0.000 16.705 0.010 Package.py:150(masks)
1568 0.098 0.000 16.693 0.011 Package.py:305(_eval_masks)
15913 1.668 0.000 16.382 0.001 __init__.py:409(use_reduce)
1 0.000 0.000 14.520 14.520 depgraph.py:531(_load_vdb)
40886/37330 3.062 0.000 13.618 0.000 __init__.py:1190(__init__)

2014-xorg-cumul.txt

ncalls tottime percall cumtime percall filename:lineno(function)
1 0.001 0.001 286.182 286.182 emerge:5(<module>)
1 0.000 0.000 286.085 286.085 main.py:971(emerge_main)
1 0.000 0.000 283.718 283.718 actions.py:3610(run_action)
1 0.000 0.000 283.641 283.641 actions.py:95(action_build)
1 0.000 0.000 272.648 272.648 depgraph.py:7900(backtrack_depgraph)
1 0.000 0.000 272.647 272.647 depgraph.py:7913(_backtrack_depgraph)
2 0.000 0.000 272.587 136.293 depgraph.py:2622(select_files)
2 0.000 0.000 272.586 136.293 depgraph.py:2637(_select_files)
2 0.000 0.000 247.803 123.902 depgraph.py:2996(_resolve)
5 0.090 0.018 172.994 34.599 depgraph.py:1581(_create_graph)
5367 0.063 0.000 151.937 0.028 depgraph.py:2165(_add_pkg_dep_string)
5367 1.506 0.000 151.874 0.028 depgraph.py:2179(_wrapped_add_pkg_dep_string)
3 0.000 0.000 126.268 42.089 depgraph.py:5725(altlist)
2837 0.415 0.000 107.741 0.038 depgraph.py:2025(_add_pkg_deps)
38224 1.557 0.000 103.709 0.003 depgraph.py:4158(_iter_match_pkgs)
2 0.000 0.000 91.256 45.628 depgraph.py:5853(_resolve_conflicts)
9309 0.184 0.000 78.969 0.008 depgraph.py:4260(_select_pkg_highest_available)
5711 0.132 0.000 78.539 0.014 depgraph.py:4426(_select_pkg_highest_available_imp)
5711 1.470 0.000 78.303 0.014 depgraph.py:4655(_wrapped_select_pkg_highest_available_imp)

===============================================================================

krinn
Watchman

Joined: 02 May 2003
Posts: 7470

Posted: Thu Nov 13, 2014 1:41 pm

- EAPI handling: the higher the EAPI version supported by Portage, the more complex the EAPI, and the slower it will be.
- Portage features: some features are enabled by default; the older the Portage version, the fewer default features it has, making it faster.
- vim, xorg...: different versions of programs can lead to big differences in ebuilds or dependencies. Maybe not for vim, but obviously nobody would say a 1.4 xorg is as light as a 1.0 xorg.

To see Portage's slowness, use the same ebuild version and run it against two Portage versions that handle that EAPI version, using a make.conf with every feature set explicitly, enabled or disabled.
You would then get results that show the difference in Portage itself, handling the same EAPI and complexity, on the same ebuild.

Because a test like this is like trying to tell whether one user reads words faster than another by asking each of them to read a text.
The problem is that user1 reads a 1000-word text, and user2 reads a 3000-word text.
Even if user2 is faster at reading a word, the result can only show user1 as faster.
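
The same point can be put in numbers: dividing cumulative time by call count gives a per-call cost, which is less sensitive to how much work each run was asked to do. A rough sketch using the _select_pkg_highest_available rows from the 2009 and 2012 xorg-x11 listings in the first post (same machine). Note that cumtime includes time spent in callees, so this is only a crude normalization.

```python
# (ncalls, cumtime in seconds) copied from the xorg-x11 listings in the first post
rows = {
    "2009": (2999, 10.193),
    "2012": (3802, 36.455),
}


def per_call(ncalls, cumtime):
    """Average cumulative seconds spent per call."""
    return cumtime / ncalls


old = per_call(*rows["2009"])
new = per_call(*rows["2012"])

# Raw ratio of cumulative times versus ratio after dividing out the call counts.
total_ratio = rows["2012"][1] / rows["2009"][1]
per_call_ratio = new / old

print("total cumtime ratio: %.2f" % total_ratio)
print("per-call ratio:      %.2f" % per_call_ratio)
```

The per-call ratio comes out smaller than the raw ratio because the 2012 run made more calls, which is exactly the "longer text" effect.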

I agree, I feel Portage is slower than before, but at least give it a real test scenario. This looks more like a "Who wants to drown his dog accuses him of rabies" solution (a typical French expression, translated; I think anyone gets it even in English).

rogerx
Tux's lil' helper

Joined: 06 Apr 2004
Posts: 118

Posted: Mon Feb 23, 2015 2:59 am

Or better yet, have a 32 bit platform (AKA Intel Pentium, i686, i586, ...) box alongside a 64 bit box with similar installed packages and compare the profiler output.

Examining the statistics will quickly give you a good idea of where the slowdowns are occurring. As a matter of fact, performing your tests entirely on those older processor platforms will magnify the time lapses within the profiler output.

But I can tell you from my experience, Python overall is extremely cumbersome for any 32 bit i686 processor! Along with the non-standardized or non-backwards-compatible Python functions, this is why I tend to stick with Bourne shell or Bash scripting, as they run much faster than Python. With my i7-3770K with 32 GB RAM, I have no problems, but I still rely on Bash so others who may be on older platforms are not hindered by my scripting/code.
_________________
Roger
http://rogerx.freeshell.org/

Ant P.
Watchman

Joined: 18 Apr 2009
Posts: 6920

Posted: Mon Feb 23, 2015 4:15 pm

90% of Portage's slowness comes from the fact it has to parse 100k Bash scripts using the Bash shell. There was a project to write a libbash for this, which would've made that 90% go away, but like many Gentoo things it never got adopted.

rogerx
Tux's lil' helper

Joined: 06 Apr 2004
Posts: 118

Posted: Mon Feb 23, 2015 5:40 pm

Question: was libbash being written in C/C++ or Python?

Sounds like more redundant code, requiring more human resources dedicated to maintenance of the new code.

If one simply compares 32 bit performance to 64 bit performance, Python overall simply requires a lot of resources to begin with. In my opinion it's probably best to get to the root of the problem, versus dancing around it. I'm guessing this is why the idea of libbash was dropped. But with all that said, I also realize the important portability features of Python, as well as the fact that Python simplifies tasks; these probably outweighed the possibility of the emerge command being written in Bourne Shell or Bash. And Python is easier to write than C or C++, hence bugs are likely fixed much faster.

I should also note, regarding krinn's excellent explanation of the slowing above, I'm pretty sure I'm seeing a lot of ebuild authors simply declaring "EAPI=5" in all their ebuilds without noticing that they do not use or do not need EAPI=5. A good example is my recent rewrite of the SiteCopy ebuild (i.e. Gentoo Bug #500070), for which somebody recently questioned not using the later version. It is incorrect to assume a later version citing EAPI=5 is better than EAPI=2, when in reality the higher number only includes more features, hence significantly increasing overhead for smaller or slower platforms!
_________________
Roger
http://rogerx.freeshell.org/

Ant P.
Watchman

Joined: 18 Apr 2009
Posts: 6920

Posted: Mon Feb 23, 2015 9:02 pm

Furthermore, the majority of code in /usr/lib/portage/python2.7/ is also Bash scripts - ebuilds are executed by using the 25KB ebuild.sh file as a wrapper, which in turn loads four more Bash scripts bringing the amount to 100KB of Bash code loaded. This is, of course, the overhead Bash imposes before portage can even do a single thing with said ebuild.

rogerx
Tux's lil' helper

Joined: 06 Apr 2004
Posts: 118

Posted: Tue Feb 24, 2015 4:42 am

Ant, I don't understand. It's as if you're saying Python is faster than Bash?

From my benchmarking here, Python is much slower and requires more resources than Bourne Shell or Bash. An example of code faster than Bourne Shell or Bash, is C; or C++ if you must. Code faster than C or C++ is Assembly Language.

Of course there may be a point where Python runs faster on faster or more resourceful systems with larger amounts of data, but on x86 32 bit my benchmarks show Bourne Shell or Bash to be faster by far, likely due to limited CPU speed.
_________________
Roger
http://rogerx.freeshell.org/

Ant P.
Watchman

Joined: 18 Apr 2009
Posts: 6920

Posted: Tue Feb 24, 2015 5:01 pm

My real-world experience doesn't match that benchmark: Paludis (C++11) is by far the slowest package manager I've ever used by an order of magnitude, using exactly the same data as Portage.

I'd like to see how you're measuring these.

rogerx
Tux's lil' helper

Joined: 06 Apr 2004
Posts: 118

Posted: Wed Feb 25, 2015 3:08 am

I've also researched Paludis within the past month or so, and upon grepping the configure.in found the following:

/var/tmp/portage/sys-apps/paludis-2.0.0/work/paludis-2.0.0/configure: as_fn_error $? "Bad ricer. No bagel. Try again with non-broken compiler flags." "$LINENO" 5
/var/tmp/portage/sys-apps/paludis-2.0.0/work/paludis-2.0.0/configure.ac: AC_MSG_ERROR([Bad ricer. No bagel. Try again with non-broken compiler flags.])

Real world experience is like comparing apples to oranges, when we're talking facts. ;-)
_________________
Roger
http://rogerx.freeshell.org/

cwr
Veteran

Joined: 17 Dec 2005
Posts: 1969

Posted: Wed Feb 25, 2015 1:30 pm

The aim of the original post was not to compare Bash with Python, but to compare the standard ebuild
usage in one Portage version with the standard ebuild usage in another. If you think the tests aren't
relevant, that's fair enough, but you then need to produce more and better tests to show that your
views are correct. Just saying you don't like the answers doesn't cut it.

Will

nitm
n00b

Joined: 27 Dec 2004
Posts: 63

Posted: Fri Jul 10, 2015 10:53 pm

Portage is not just getting slower. It is getting unbearably slow.
And the limiting factor is not disk, but CPU:
Code:
# time emerge -avDuN world -p

real    1m14.206s
user    1m13.334s
sys     0m0.856s

# time emerge -avDuN world -p
real    1m14.121s
user    1m13.313s
sys     0m0.790s


Two runs one after the other - same time.
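
To make the "CPU, not disk" reading explicit: if user plus sys time is nearly all of the wall-clock time, the process was hardly ever waiting on I/O. A small sketch in Python using the first run's figures; the helper names seconds and cpu_fraction are made up for illustration.

```python
import re


def seconds(stamp):
    """Convert a bash `time` value such as '1m14.206s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if match is None:
        raise ValueError("unexpected time format: %r" % stamp)
    return int(match.group(1)) * 60 + float(match.group(2))


def cpu_fraction(real, user, sys_time):
    """Share of wall-clock time spent on the CPU (user + sys) / real."""
    return (seconds(user) + seconds(sys_time)) / seconds(real)


# Figures from the first run above.
first_run = cpu_fraction("1m14.206s", "1m13.334s", "0m0.856s")
print("CPU share of wall-clock time: %.1f%%" % (first_run * 100))
```

A value this close to 100% means a faster disk would not help; only less work per run or a faster CPU would.
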
Is there a bug in the bug tracker that I can vote for?

Why is so much time wasted in dep resolving?
Why would portage load and parse bash files for this and not use some custom format that has all the data preprocessed?

edit: this is on a Core 2 Quad @ 3.2 GHz, 8 GB DDR2 and a 512 GB SSD!

rogerx
Tux's lil' helper

Joined: 06 Apr 2004
Posts: 118

Posted: Sat Jul 11, 2015 12:19 am

Python is just inherently slow. Compare Python to Bash, SED and AWK, or C just using something like printf.

On top of this, adding more features without thoroughly thinking through "Programming from the Ground Up" techniques will also tend to cause slowing.

About the only thing I see Python being good for is its cross-platform capabilities. Other than this; Bash, SED or AWK, and C can easily outperform Python.
_________________
Roger
http://rogerx.freeshell.org/

bstaletic
Apprentice

Joined: 05 Apr 2014
Posts: 253

Posted: Sat Jul 11, 2015 12:30 am

Code:
emerge -avuDNp @world  7.93s user 0.07s system 99% cpu 8.058 total
emerge -avuDNp @world  7.85s user 0.09s system 99% cpu 7.955 total

i5 2500 and 8GB DDR3 and an ancient HDD.

steveL
Watchman

Joined: 13 Sep 2006
Posts: 5153
Location: The Peanut Gallery

Posted: Sat Jul 11, 2015 10:18 am

Ant P. wrote:
90% of Portage's slowness comes from the fact it has to parse 100k Bash scripts using the Bash shell.

No, it doesn't. That's what the metadata cache is for.
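
As a rough illustration of why the cache avoids Bash: a cache entry is, as far as I know, a plain text file of KEY=value lines, so reading a package's dependencies needs no shell at all. The sketch below assumes that format. The sample entry is made up for illustration rather than copied from a real tree, and parse_cache_entry is a hypothetical helper, not a Portage function.

```python
# Made-up sample entry, assuming a KEY=value-per-line cache format.
SAMPLE_ENTRY = """\
DEPEND=>=sys-libs/ncurses-5.2 nls? ( sys-devel/gettext )
EAPI=5
IUSE=nls
SLOT=0
"""


def parse_cache_entry(text):
    """Split each line on the first '=' only, since values may contain '='."""
    entry = {}
    for line in text.splitlines():
        if not line:
            continue
        key, _, value = line.partition("=")
        entry[key] = value
    return entry


entry = parse_cache_entry(SAMPLE_ENTRY)
print(entry["EAPI"], entry["DEPEND"])
```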

The slowdown over the last year or 18 months is caused by multilib, which is now established and accepted, so it's just getting worse.

Ottre
Tux's lil' helper

Joined: 23 Dec 2012
Posts: 129

Posted: Sat Jul 11, 2015 2:10 pm

rogerx wrote:
Python is just inherently slow. Compare Python to Bash, SED and AWK, or C just using something like printf.

Other than this; Bash, SED or AWK, and C can easily outperform Python.


You can't simply re-write the python code in bash. The maximum bash version allowed by PMS is 3.2.

That means no associative arrays, pretty important if you want your code to be readable.

There's no local -n, so you have to spawn a subshell if you want a function to return a string, which is a big performance hit.

The_Great_Sephiroth
Veteran

Joined: 03 Oct 2014
Posts: 1602
Location: Fayetteville, NC, USA

Posted: Sat Jul 11, 2015 2:22 pm

I am just curious as to why this isn't in C or C++. I would think that such a core functionality would be best served by actual binaries.
_________________
Ever picture systemd as what runs "The Borg"?

Ottre
Tux's lil' helper

Joined: 23 Dec 2012
Posts: 129

Posted: Sat Jul 11, 2015 3:32 pm

The_Great_Sephiroth wrote:
I am just curious as to why this isn't in C


I don't think straight C is a good option, because you need regular expressions to do package management. The external libraries you could use (eg PCRE) have had a bunch of security vulnerabilities.

rogerx
Tux's lil' helper

Joined: 06 Apr 2004
Posts: 118

Posted: Sat Jul 11, 2015 6:49 pm

Ottre: Some interesting tidbits there on why Portage was written using Python instead of Bash or C.

Ditto concerning Bash sub-shells, as Bash sub-shells inherently lose any global variable definitions made inside them as well. (I just tracked down a long-unknown bug within the abcde CD utility, which was likely using sub-shells without knowing about this effect.)

But given enough time; I think Bash, Sed and/or AWK could be utilized to rewrite Portage with far better results. Personally, I still see Python as somewhat of a trend or "what's hip" versus what is best, gambling that faster processing units will offset the slowness. (I looked at Paludis, the C++ version of Portage, a while ago, and was somewhat dismayed at its direction, leadership or sense of responsibility.)

However, everybody (who doesn't like Python) is probably stuck with this issue concerning Portage slowness due to Python now. Nor do I want to open a can of worms before I get hungry enough to eat the whole can!
_________________
Roger
http://rogerx.freeshell.org/

nitm
n00b

Joined: 27 Dec 2004
Posts: 63

Posted: Sat Jul 11, 2015 7:32 pm

rogerx wrote:

But given enough time; I think Bash, Sed and/or AWK could be utilized to rewrite Portage with far better results.

Are you really suggesting that implementing graph traversal algorithms in bash/sed/awk will be faster than python or c/c++?

haarp
Guru

Joined: 31 Oct 2007
Posts: 535

Posted: Sat Jul 11, 2015 7:41 pm

rogerx wrote:
Other than this; Bash, SED or AWK [...] can easily outperform Python.

what?

hasufell
Retired Dev

Joined: 29 Oct 2011
Posts: 429

Posted: Sat Jul 11, 2015 8:26 pm

It's the input.

John R. Graham
Administrator

Joined: 08 Mar 2005
Posts: 10589
Location: Somewhere over Atlanta, Georgia

Posted: Sun Jul 12, 2015 12:09 am

The_Great_Sephiroth wrote:
I am just curious as to why this isn't in C or C++. I would think that such a core functionality would be best served by actual binaries.

Then you must see this as a distinct advantage for systemd, right? ;)

- John
_________________
I can confirm that I have received between 0 and 499 National Security Letters.

rogerx
Tux's lil' helper

Joined: 06 Apr 2004
Posts: 118

Posted: Sun Jul 12, 2015 2:27 am

Geez. I knew it wouldn't be long before the complaining would start. Shrugs.

Anyways, I didn't get the comment concerning SystemD. (I'm not a fan of SystemD, due to its complications.)

Updated: I missed a comment that misunderstood a previous comment. No, I'm not suggesting Bash/SED/AWK is faster than C/C++. From my benchmarking, fastest to slowest language: Assembly, C/C++, Bash/SED/AWK, Python. Python tends to require large amounts of overhead (or system resources) in order to be robust.
_________________
Roger
http://rogerx.freeshell.org/

John R. Graham
Administrator

Joined: 08 Mar 2005
Posts: 10589
Location: Somewhere over Atlanta, Georgia

Posted: Sun Jul 12, 2015 3:03 am

My intuition is that you're incorrect about Python being significantly slower than Awk and for that matter Awk being comparable to Bash as Bash has the reputation of being relatively slow. Sed doesn't even count since it's not anywhere near a complete programming language. I'd be interested in what benchmarks you used to come to this conclusion. (I'd also claim that such a benchmark for Sed doesn't even exist.)

Speaking of facts, here are a few trivial "empty loop" benchmarks for you:
test01.py:
j = 1
for i in range(1000000):
    j += 1

print("All done. j = " + repr(j))
and
test01.bash:
#!/bin/bash

j=0
for ((i=0; i<=1000000; i++)) ; do
    let j++
done

echo "All done. j = $j"
and
test01.awk:
BEGIN {
    j = 0;
    for (i = 0; i <= 1000000; i++)
        j++;
    print "All done. j = " j;
}

Their respective execution times? Here you go:
Code:
jgraham@hal ~ $ time python test01.py
All done. j = 1000001

real    0m0.323s
user    0m0.307s
sys     0m0.013s
jgraham@hal ~ $ time ./test01.bash
All done. j = 1000001

real    0m12.840s
user    0m12.823s
sys     0m0.013s
jgraham@hal ~ $ time awk -f test01.awk
All done. j = 1000001

real    0m0.232s
user    0m0.230s

The above shows Python to be roughly 40 times faster than Bash, and Awk 55 times faster than Bash, in basic control flow. So, where in your mind does Bash shine over Python?
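
For the record, those ratios come straight from the "real" figures above. A quick check in Python (the variable names are just for illustration):

```python
# Wall-clock ("real") seconds from the three timing runs above.
real_seconds = {
    "python": 0.323,
    "bash": 12.840,
    "awk": 0.232,
}

# How many times faster each language ran the loop than Bash did.
speedup_vs_bash = {
    name: real_seconds["bash"] / elapsed
    for name, elapsed in real_seconds.items()
    if name != "bash"
}

for name, factor in sorted(speedup_vs_bash.items()):
    print("%s is about %.1f times faster than bash" % (name, factor))
```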

systemd is written in C, which as a "core functionality" is as it should be, according to some. Means it's fast, right? ;)

- John
_________________
I can confirm that I have received between 0 and 499 National Security Letters.

Akkara
Bodhisattva

Joined: 28 Mar 2006
Posts: 6702
Location: &akkara

Posted: Sun Jul 12, 2015 4:25 am

John R. Graham wrote:
Sed doesn't even count since it's not anywhere near a complete programming language.

Actually, sed is Turing-complete. It just isn't a particularly welcoming language to work in.

Here's your benchmark, in sed. Run it with
    echo 0 | time sed -f test01.sed >/dev/null

Code:
:a
s:$:#:
:b
s:9#:#0:
s:8#:9:
s:7#:8:
s:6#:7:
s:5#:6:
s:4#:5:
s:3#:4:
s:2#:3:
s:1#:2:
s:0#:1:
s:^#:1:
/#/bb
p
/1000000/!ba

On this laptop, it takes around 2.5 seconds to run. No idea how the laptop compares with your machine, but if they are similar, it would make sed roughly 4-5x faster than bash.

(Btw, I'm not a sed expert. I don't know if this is the best way to code this.)

Edit: It occurred to me that the counter doesn't have to count in base 10. Maybe a smaller base counts faster because of fewer substitutions in the increment-digit part. So I tried a few, adjusting the stopping match as needed to represent one million in whatever base I was using.

It turns out it doesn't make a big difference. Base 2 is the worst at about 60% slower, probably because carries occur more frequently making it go around the inner loop more times. Base 3 is roughly 10% slower than base 10. Base 4, 5, and 6 are about the same, at 5-6% faster than base 10. Beyond that, the times gradually increase with increasing base.

Here's the base-5 version:
Code:
:a
s:$:#:
:b
s:4#:#0:
s:3#:4:
s:2#:3:
s:1#:2:
s:0#:1:
s:^#:1:
/#/bb
p
/224000000/!ba
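
The stopping match for any base can be worked out mechanically rather than by hand. A small sketch in Python (the helper name to_base is made up for illustration) that converts one million into the digit string the final /.../!ba test has to match:

```python
def to_base(number, base):
    """Return the digits of a non-negative integer in the given base (2..10)."""
    if not 2 <= base <= 10:
        raise ValueError("base must be between 2 and 10")
    if number == 0:
        return "0"
    digits = []
    while number > 0:
        number, remainder = divmod(number, base)
        digits.append(str(remainder))
    # Remainders come out least significant digit first, so reverse them.
    return "".join(reversed(digits))


for base in (2, 3, 5, 10):
    print(base, to_base(1000000, base))
```

For base 5 this gives 224000000, the pattern used in the script above.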

_________________
Many think that Dilbert is a comic. Unfortunately it is a documentary.

All times are GMT
Page 1 of 2