Gentoo Forums
Rant: Nepomuk and strigi - solution in search of a problem
navistrar
n00b

Joined: 13 Aug 2011
Posts: 3

Posted: Sat Aug 13, 2011 8:07 pm    Post subject: Getting rid of Nepomuk / Strigi / Akonadi. Bigger Picture

After hours of searching, my only findings regarding these - Big Brother / Facebook / cnt liv wout my clphon generation - features are multitudes of requests for their REMOVAL, not merely for the ability to "disable" them. If we can DISABLE them, why can't we REMOVE them? And better yet... Why can't we CHOOSE to install KDE without Strigi/Akonadi/Nepomuk if we don't use PIM apps nor want file indexing?

I looked into compiling KDE without Akonadi/Nepomuk/Strigi. Compiling kdebase-runtime has 3 dependencies and strigi is one of them. Yet in 'system settings' I can turn off strigi?!? Kind of makes you wonder if "turning it off" really "turns it off".

Has anyone found a way to compile KDE without this plague?

albright
Advocate

Joined: 16 Nov 2003
Posts: 2109
Location: Near Toronto

Posted: Sun Aug 14, 2011 1:13 pm

This isn't about getting rid of strigi/nepomuk etc but limiting their
intrusiveness.

I noticed a program called cpulimit which lets you set a cpu limit
for any process. It's very simple and tiny.

I've set virtuoso-t to use less than 25% cpu and it seems to be
working (but we'll see if the indexing *ever* settles down)

Anybody tried this? Is there a better way to limit cpu usage of
kde indexing (I know you can limit memory size of virtuoso-t).
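
For anyone curious how a limiter like this can work: as far as I know, cpulimit pauses and resumes the target with SIGSTOP/SIGCONT. Below is a rough Python sketch of that idea only. The names `duty_cycle` and `throttle` are made up for illustration, and unlike the real tool this does not measure actual CPU use; it just lets the process run for a fixed share of each period.

```python
import os
import signal
import time


def duty_cycle(limit_percent, period=0.1):
    """Split one period into (run_seconds, stop_seconds) for a given cap."""
    if not 0 < limit_percent <= 100:
        raise ValueError("limit_percent must be in (0, 100]")
    run = period * limit_percent / 100.0
    return run, period - run


def throttle(pid, limit_percent, duration, period=0.1):
    """Alternately resume and pause `pid` for roughly `duration` seconds."""
    run, stop = duty_cycle(limit_percent, period)
    deadline = time.monotonic() + duration
    try:
        while time.monotonic() < deadline:
            os.kill(pid, signal.SIGCONT)      # let the process run
            time.sleep(run)
            if stop > 0:
                os.kill(pid, signal.SIGSTOP)  # pause it for the rest of the period
                time.sleep(stop)
    finally:
        os.kill(pid, signal.SIGCONT)          # never leave the process frozen
```

Something like `throttle(pid_of_virtuoso, 25, duration=60)` would hold the process to about a quarter of wall-clock time for a minute, where `pid_of_virtuoso` is whatever PID you looked up yourself.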
_________________
.... there is nothing - absolutely nothing - half so much worth
doing as simply messing about with Linux ...
(apologies to Kenneth Graeme)

mv
Advocate

Joined: 20 Apr 2005
Posts: 4221

Posted: Sun Aug 14, 2011 2:22 pm    Post subject: Re: Getting rid of Nepomuk / Strigi / Akonadi. Bigger Pictur

navistrar wrote:
Has anyone found a way to compile KDE without this plague?

Since KDE upstream apparently does not support this, it would require heavy patching. I doubt that this can be maintained in the long run without convincing upstream. I am already very grateful that the Gentoo developers succeeded in making at least nepomuk optional, although this too is probably not really supported by upstream. For this reason, Gentoo is currently the only distribution with a reasonable DE besides Xfce.

navistrar
n00b

Joined: 13 Aug 2011
Posts: 3

Posted: Tue Aug 16, 2011 3:49 am

albright wrote:
... limiting their intrusiveness...


Exactly. I can tell you "get it". And your way of putting it is much more eloquent than mine :) It IS about intrusiveness. What set me off, during my search for an answer, was running into a series of exchanges between KDE developers advocating a "direction" which seems focused strictly on catering to/pleasing the "social networking" crowd. These are people who, for the most part, have no clue about the concept of privacy. Simply put... KDE SOLD OUT!

On a more optimistic note, this led me on a quest to weed out not only KDE but every program having anything to do with KDE (kdebase-runtime) and build a new system based on Xfce. So I went from "hmmm" to "not bad" to "I like this a lot" to "I'm totally excited about my new system".

So after 8+ years on KDE, I'm once again a Linux newbie (XFCE) :)

Yamakuzure
Veteran

Joined: 21 Jun 2006
Posts: 1402
Location: Bardowick, Germany

Posted: Tue Aug 16, 2011 12:00 pm

Well, I tried. I tried with awesome, E17, Xfce, LXDE, FluxBox and OpenBox-only, and all have two things in common: 1) I need too many KDE apps to get my work done and 2) everything looks extremely ugly. But at least KDE works rather well since 4.6.4, and 4.7.0 made me look into the SystemSettings thrice already to check whether Nepomuk and Strigi are really still running...
_________________
systemd - The biggest fallacies

navistrar
n00b

Joined: 13 Aug 2011
Posts: 3

Posted: Tue Aug 16, 2011 7:15 pm

Yamakuzure wrote:
I need too many KDE apps to get my work done and 2) everything looks extremely ugly.

After using KDE for nearly 10 years, believe me, I felt the same way. But once I made the decision to get rid of KDE and all its programs and started really researching the replacements, I concluded 2 things. 1) There is an alternative app/way for everything, but it does require an adjustment and a getting-used-to period (for a greater benefit IMO) and 2) It can all be made to look just as 'pretty' - with the added benefit of the satisfaction that it was made that way 'just for you and your needs'

Yamakuzure wrote:
...at least KDE works...

My hat's off to the developers up until this point. KDE is slick, very visually appealing and loaded with great features (IMO). However, forcing the evil triples (Akonadi / Nepomuk / Strigi) on users now makes it evident that 'they' sold out to the 'social networking' bunch - making KDE a system that can't be trusted with private/confidential data. But it does look good :)

Yamakuzure
Veteran

Joined: 21 Jun 2006
Posts: 1402
Location: Bardowick, Germany

Posted: Tue Aug 23, 2011 9:22 am

@navistrar: Well, I tried. But there is nothing comparable to Konsole+Yakuake (*). And I haven't found any subversion client that fits my needs like kdesvn does. Dolphin, once set up (the default setup is a pain, really) is better than any other file manager I've ever seen (including "Directory Opus"). The combination of kwrite/kate is simply awesome, as they are both powerful and yet simple to use. These are just some examples of course.

But of course you are right. Forcing the "Deathly Triple" onto every user is, I totally agree here, a bad thing to do. On my notebook I am using kdepim every day and need those. But on my desktop machine, which is dedicated to development and controlling, I'd disable them if I had the chance. Here more choice would be a lot better.

(*) I must add that I need the encoding feature of Konsole rather often. And it is very helpful to simply "switch" the Konsole to ISO-8859-15 or IBM850 before cat-ing a text file to see whether a conversion was successful.
_________________
systemd - The biggest fallacies

Randy Andy
Veteran

Joined: 19 Jun 2007
Posts: 1058
Location: /dev/koelsch

Posted: Tue Aug 23, 2011 12:27 pm

[Edit] Sorry for the confusion, I wrote this post after reading only page one of this long thread - let's see if it's outdated (I'm now reading the rest) :?

Yes NightMonkey,
here @work, on my bloody f.. Win Vista system, I can state that the search function of Explorer (the file manager) takes up to 50% of my computer's CPU power.

Could things be different (and an advantage for Linux) if the indexer used the inotify / fanotify functionality of the Linux kernel, which tells it directly which files have changed, instead of letting a tool search the whole drive by itself all the time? Possibly the mount options of the filesystems (atime / noatime / relatime...) have some influence too.
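
To make the comparison concrete, here is a small Python sketch of the "search the whole drive" approach: walk everything, remember modification time and size, and re-index only what differs from the last pass. The function names `snapshot` and `files_to_reindex` are made up for this example. With inotify the kernel would hand over the changed paths directly and the walk would not be needed at all; Python's standard library has no inotify binding, so that half is not shown.

```python
import os


def snapshot(root):
    """Map every file path under `root` to its (mtime_ns, size)."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished between listing and stat
            state[path] = (st.st_mtime_ns, st.st_size)
    return state


def files_to_reindex(old, new):
    """Return (changed_or_new, deleted) paths between two snapshots."""
    changed = sorted(p for p, sig in new.items() if old.get(p) != sig)
    deleted = sorted(p for p in old if p not in new)
    return changed, deleted
```

Even in this form every pass still has to stat each file, which is exactly the cost a change-notification mechanism would avoid.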

@albright
I'm interested in which files don't get picked up by the indexer. Did you find a pattern, or particular file types?
On my 64-bit Win Vista PC, *.svg files in particular aren't noticed fast enough by the indexer. After creating them with Illustrator, Explorer won't find them via the recent files entry, but it finds others, e.g. the same files saved as *.ai a few seconds before - ha!

A different funny thing:
A few days ago, on this machine, I copied the contents of two USB sticks in parallel to a connected USB hard drive.
After starting to copy the first stick, I got 50% CPU load; after starting the second one as well, I got 100% for the whole duration of the copy - haha.
OK, it might be the influence of the realtime AV scan engine, which I can't measure (and don't need on pure Linux) because I can't see those tasks without admin rights here.
No need to say how slowly the machine reacts during this. OK, it's only a Dell Precision workstation with a 4-drive RAID system, a quad-core Xeon at 2.93 GHz and 12GB RAM, maybe too slow for these needs :wink:

Nevertheless, when I did the same thing @home, on my weaker machine with Linux, I got only about 15% +-5% load (on KDE-4.7, with strigi indexing activated).

Regarding the indexing of files on kde-4.x, here are some of my findings on different configurations, which can massively influence memory and CPU consumption, at least on my machines.
The CPU consumption of strigi rises to 100% for a long time and doesn't drop back to moderate values if I set its memory limit to more than 200MB on my strongest machine @home (quad-core 2.4 GHz with 8GB RAM).
With the memory setting at 200MB, I get a short peak for about 5 seconds after logging on to KDE, then it drops down to zero.
On my old Atom notebook with 1GB RAM, I give strigi only 50MB of memory, which works fine for me too.
And you shouldn't let it index the whole hard drive (not your Linux system with its lots of files), only the directories you are interested in, like your home and the directories on other archive drives.

With these settings the whole nepomuk / strigi / akonadi / semantic-desktop crap works well enough for me that I have no need to uninstall it.
Although I don't really need it, it's not bad enough to uninstall, nor does it influence my desktop behaviour or system stability in a negative way.

So I wonder why I so often read about others having such a lot of trouble with these tools.
Could it all have to do with different configuration settings, as mentioned (KDE + kernel + mount options)?
Is it something we should investigate more deeply and systematically to find the real root cause?

What does the community think about it?
Does any useful configuration wiki exist that covers these aspects in one document?

Andy.
_________________
If you want to see a Distro done right, compile it yourself!

turtles
Veteran

Joined: 31 Dec 2004
Posts: 1235

Posted: Thu Dec 01, 2011 7:13 pm

I wanted to wake up this thread to see if any of you really want to fix this.

It would take a few days of a competent coder's time to write some patches for KDE, and they would probably need to go into a Gentoo overlay.
It will take only 2 major changes.

1 I have started working on compiling KDE without strigi

2 From what I can tell the kmail composer search and mail search hard depend on libnepomuk4.
As well as tooltips. Assuming you all don't need tooltips we just have to build kmail to use the old grep -r command.
And remove the call to libnepomuk4.
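
To show how little the fallback needs, here is a rough Python equivalent of a `grep -r` over a mail directory. This is only a sketch: `grep_recursive` is a made-up name, it is not anything kmail provides, and a plain substring match will miss text in base64 or quoted-printable encoded message bodies.

```python
import os


def grep_recursive(root, needle, encoding="utf-8"):
    """Yield (path, line_number, line) for every line containing `needle`."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding=encoding, errors="replace") as handle:
                    for number, line in enumerate(handle, start=1):
                        if needle in line:
                            yield path, number, line.rstrip("\n")
            except OSError:
                continue  # unreadable file: skip it and keep going
```

No index and no daemon, at the price of reading every message on every search.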

Anyone want a minimal KDE build option without installing strigi or nepomuk?
I am not, however, talking about a functional KDE as a DE, just kmail, kontact and similar apps.

Thoughts?
_________________
Donate to Gentoo

turtles
Veteran

Joined: 31 Dec 2004
Posts: 1235

Posted: Fri Dec 09, 2011 2:56 am

For anyone who is interested in not using KDE as a WM and just using KDE apps
kdiff3, okular, kmail, kontact, kwalletd etc.
-knotify -nepomuk -strigi
I am working on it.

Just to make sure I am not duplicating efforts I checked out kde-sunset and trinity.
I remember tons of bugs in kde 3.5.10
Plus the bloat of having kde3 and kde4 is too much;
for example:
Just to have kmail-3.5.10 -nepomuk -strigi
Code:
These are the packages that would be merged, in order:

Calculating dependencies... done!
[ebuild  NS    ] sys-devel/automake-1.9.6-r3 [1.10.3, 1.11.1] 748 kB [0]
[ebuild  N     ] net-misc/mDNSResponder-212.1  USE="java -debug -doc" 1,575 kB [0]
[ebuild  N     ] net-dns/libidn-1.22  USE="java nls static-libs -doc -emacs -mono" 3,278 kB [0]
[ebuild  NS    ] kde-base/kdelibs-3.5.10-r6 [4.6.5-r2] USE="acl alsa branding cups fam jpeg2k spell tiff -arts -avahi -bindist -debug -doc -kdehiddenvisibility -kerberos -legacyssl -lua -openexr -utempter" 15,270 kB [1]                           
[ebuild  N     ] kde-base/libkmime-3.5.10  USE="-debug" 14,219 kB [1]
[ebuild  N     ] kde-base/libkdenetwork-3.5.10-r1  USE="-debug" 0 kB [1]
[ebuild  NS    ] kde-base/libkonq-3.5.10 [4.6.5] USE="-debug -kdehiddenvisibility" 23,770 kB [1]
[ebuild  N     ] kde-base/kdebase-data-3.5.10  USE="-debug" 0 kB [1]
[ebuild  N     ] kde-base/mimelib-3.5.10  USE="-debug" 0 kB [1]
[ebuild  NS    ] kde-base/libkpgp-3.5.10 [4.4.11.1] USE="-debug" 0 kB [1]
[ebuild  N     ] kde-base/libksieve-3.5.10  USE="-debug" 0 kB [1]
[ebuild  N     ] kde-base/kmailcvt-3.5.10  USE="-debug" 0 kB [1]
[ebuild  N     ] kde-base/ktnef-3.5.10  USE="-debug" 0 kB [1]
[ebuild  N     ] kde-base/kcminit-3.5.10  USE="-debug -kdehiddenvisibility" 0 kB [1]
[ebuild  N     ] kde-base/khotkeys-3.5.10  USE="-debug -kdehiddenvisibility" 0 kB [1]
[ebuild  NS    ] kde-base/kdesu-3.5.10 [4.6.5] USE="-debug -kdehiddenvisibility" 26 kB [1]
[ebuild  N     ] kde-base/kdialog-3.5.10  USE="-debug -kdehiddenvisibility" 0 kB [1]
[ebuild  N     ] kde-base/kmenuedit-3.5.10  USE="-debug -kdehiddenvisibility" 0 kB [1]
[ebuild  N     ] kde-base/certmanager-3.5.10-r1  USE="-debug" 24 kB [1]
[ebuild  N     ] kde-base/libkcal-3.5.10  USE="-debug" 0 kB [1]
[ebuild  N     ] kde-base/kdebase-kioslaves-3.5.10-r1  USE="-debug -hal -kdehiddenvisibility -ldap -openexr -samba" 0 kB [1]                                                                                                                         
[ebuild  N     ] kde-base/kdepim-kioslaves-3.5.10-r1  USE="sasl -debug" 131 kB [1]
[ebuild  N     ] kde-base/kicker-3.5.10-r2  USE="-debug -kdehiddenvisibility -xcomposite" 0 kB [1]
[ebuild  N     ] kde-base/libkdepim-3.5.10  USE="-debug" 0 kB [1]
[ebuild  NS    ] kde-base/khelpcenter-3.5.10 [4.6.5] USE="-debug -kdehiddenvisibility" 0 kB [1]
[ebuild  N     ] kde-base/libkpimidentities-3.5.10  USE="-debug" 0 kB [1]
[ebuild  N     ] kde-base/kcontrol-3.5.10  USE="joystick opengl -arts -debug -ieee1394 -kdehiddenvisibility -logitech-mouse" 0 kB [1]
[ebuild  N     ] kde-base/kontact-3.5.10  USE="-debug" 0 kB [1]
[ebuild  N    ~] kde-base/kmail-3.5.10-r2  USE="crypt -debug" 0 kB [1]

Total: 29 packages (23 new, 6 in new slots), Size of downloads: 59,039 kB
Portage tree and overlays:
 [0] /usr/portage
 [1] /usr/portage/local/layman/kde-sunset

Would you like to merge these packages? [Yes/No] 


No thanks.
Gentoo is about choice.
I won't bump this again until I have something to share.
Get in touch if you are interested in helping.
thanks
_________________
Donate to Gentoo

mahdi1234
Guru

Joined: 19 Feb 2005
Posts: 491
Location: far from new world orderia

Posted: Wed Dec 14, 2011 11:59 am

turtles wrote:
For anyone who is interested in not using KDE as a WM and just using KDE apps
kdiff3, okular, kmail, kontact, kwalletd etc.
-knotify -nepomuk -strigi
I am working on it.


Great news :) I'm no coder, so I couldn't help with that part, but no problem on my side to help with testing the ebuilds (x86).

PS - I'm an LXDE user with Akregator, so pretty keen on dropping all that strigi/nepomuk/akonadi crap

cheers

radio_flyer
Apprentice

Joined: 04 Nov 2004
Posts: 167
Location: Northern California

Posted: Mon Dec 19, 2011 10:17 am

Personally, I'd love to see this KDE semantic desktop make progress, but I think the KDE developers really screwed the pooch on the way they went about implementing it. I think a fully functional semantic desktop is still a decade away, and I personally believe it won't be really usable until it's implemented as part of something like a content-addressable ext8 filesystem on the 1024-core 100T RAM system you can buy off newegg in 2020. Until then, they should have made the thing optional so it doesn't annoy all of the folks who aren't interested in bleeding at the edge of computer science.

Randy Andy, I agree with your belief that inotify would be great for avoiding all that time-wasting re-indexing. However, this page http://techbase.kde.org/Development/Tutorials/Metadata/Nepomuk/FileWatchService seems to indicate that inotify is totally useless for the most part. In some respects, nepomuk is remodeling a fancy bath in which the plumbing doesn't yet meet building codes.

What I don't understand are the comments in this forum about KDE selling out to the 'social networking' crowd. Near as I can tell, this KDE semantic desktop stuff is more along the lines of the thoughts expressed in this article: http://radar.oreilly.com/2011/07/why-files-need-to-die.html . However, also as near as I can tell, KDE isn't anywhere near to implementing something like that. Their 'ontology' for their RDF dataset is this: http://www.semanticdesktop.org/ontologies/2007/03/22/nfo/ It seems to me most of that information is already available as file extensions, file attributes, or via some combination of 'grep', 'ls' and 'find'.

The problem I see is that a true semantic desktop is going to be hard. superstoned is right--some of us Gentoo guys, like myself, are willing to play with it. I've been trying out what I can figure out. However, I have to agree with what albright posted: the current implementation is broken to the core. After deleting the first index nepomuk created because it crashed nepomuk every time with some sort of internal inconsistency, I re-ran the indexer and now have an approximately 1G RDF dataset file. One application I have for something like this is PDF indexing: I have thousands of technical documents with names like 'an835.pdf' scattered in various folders. 'an835.pdf' might be a 10-page application note from Xilinx on doing FIR filters in an FPGA. What I'd like to do with nepomuk/strigi is ask it for the location of every pdf created by Xilinx that talks about FIR filters in FPGAs. However, it doesn't appear to me that KDE has any way of querying the database in that way; in fact, I'm not sure nepomuk/strigi even indexes files in that way yet. At least, not without my having to go in and manually tag every such file with what I consider to be important attributes, like 'Xilinx' and 'FIR Filter' and 'FPGA'. If the semantic desktop depends on that sort of manual tagging, it's doomed to failure. Humans like me are far too lazy for that. You'll need something like a future content-addressable ext8 filesystem for that to work.

As a test, I personally verified the existence on my system of four PDFs from Xilinx that contained the exact phrase and the concept 'FIR filter'. I then tried a Dolphin contents search for that phrase. It found just one of the (at least) four files. ???? So tell me, exactly what information does this ~1G database contain? How am I supposed to use this semantic desktop? I keep reading that the KDE semantic desktop is 'much more than just a file indexer', but when I can't even get that to function correctly, then what?
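
For reference, this is roughly the behaviour I expected from a contents search, sketched in Python as a tiny positional inverted index with an exact-phrase query. It assumes the text has already been extracted from the PDFs by some other tool, the function names are made up for illustration, and it says nothing about how nepomuk/strigi actually store things.

```python
import re
from collections import defaultdict

WORD = re.compile(r"\w+")


def tokenize(text):
    """Lower-case the text and split it into words."""
    return [w.lower() for w in WORD.findall(text)]


def build_index(documents):
    """documents: {name: extracted_text} -> {word: {name: [positions]}}."""
    index = defaultdict(lambda: defaultdict(list))
    for name, text in documents.items():
        for position, word in enumerate(tokenize(text)):
            index[word][name].append(position)
    return index


def phrase_search(index, phrase):
    """Return sorted names of documents containing the phrase's words in a row."""
    words = tokenize(phrase)
    if not words:
        return []
    hits = []
    for name, starts in index.get(words[0], {}).items():
        for start in starts:
            # every following word must sit at the next position in the same file
            if all(start + i in index.get(w, {}).get(name, ())
                   for i, w in enumerate(words[1:], start=1)):
                hits.append(name)
                break
    return sorted(hits)
```

With an index like this, a query for 'FIR filter' either returns every file containing those two words in sequence or it has a bug; there is no in-between of one out of four.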

So for the 90% of you on this thread that just want the thing to go away, I sympathize. This stuff seems alpha to the core, and to hard-depend on it is insane. (I also wonder what will happen to its development when the EU funding for it runs out.) However, my questions are for the other 10% of you that have actually tried to use it: How? Have you actually gotten this semantic desktop thing to actually work for some particular use-case?

baaann
Guru

Joined: 23 Jan 2006
Posts: 497
Location: uk

Posted: Mon Dec 19, 2011 11:09 am

Quote:
(I also wonder what will happen to its development when the EU funding for it runs out.)

AFAIK the funding has finished, the lead dev Sebastian Trüg has his own fundraiser in order to continue development.

It is well worth reading his blog entries, and the indication is that 4.8 will be vastly improved. With regard to PDFs, this post advises that they are a work in progress, with the metadata not being extracted as yet

Dr.Willy
Guru

Joined: 15 Jul 2007
Posts: 346
Location: NRW, Germany

Posted: Mon Dec 19, 2011 11:32 am

radio_flyer wrote:
What I don't understand are the comments in this forum about KDE selling out to the 'social networking' crowd. Near as I can tell, this KDE semantic desktop stuff is more along the lines of the thoughts expressed in this article: http://radar.oreilly.com/2011/07/why-files-need-to-die.html

Quote:
In the world of linked data and semantically indexed information, saving or losing data is not something we'll have to worry about. The stream is saved. Think about it: You'd never have to organize your emails or project plans because everything would be there, as connected as the thoughts in your head. Collaborating and sharing would simply mean giving other people access to read from or contribute to part of your stream.

We already see a glimpse of this world when we look at Facebook. It's no wonder that it's so successful; it lets us deal with people, events, messages and photos — the real fabric of our everyday lives — not artificial constructs like files, folders and programs

Summary: "Wouldn't it be great if computers were as confused as our brains?"

Tatsh
Tux's lil' helper

Joined: 22 Jul 2007
Posts: 102

Posted: Mon Dec 19, 2011 12:28 pm

Wow. This debate never ends.

But I'm not on the KDE side either. I just purged my system of anything and everything KDE PIM related. A few months back when KMail stopped working, I killed it really quick. But I decided to give it another try recently. Nope. Still crapping out the same way. It is completely unacceptable. And what about Kontact and all that? Not usable unless you can decode the error messages that Akonadi spits out and somehow fix the issues. Did I mention I get Nepomuk error messages without clear explanation on EVERY login?

I just don't understand it. And I mean it. I just do not understand why Akonadi/Strigi/Nepomuk was thought up, what it is for, and to be honest, I do NOT care. Would I like to search my PDF contents easily? Yes. Does it need to be in KDE? No! How about someone make pdfgrep (and a library too)? Wouldn't that be easier to use on an automated basis anyway?

I cannot believe this is where 'we' want things to go. I always tell people who want to try Linux: 'if you're not a developer of SOMETHING and you don't care about undocumented standards, it's pointless for you.' And I really do not care if things stay this way.

The entire 'Linux community' (and FOSS community) I thought was producing things by us for us. Why should this change? This model has stood the test of time; otherwise the kernel would've been an abandoned project a long time ago because it was trying to 'guess' at what users want instead of waiting on their input (e.g. I need X device supported, submit patch or someone else adds support, new kernel module; it is an iterative process and done on an as-needed basis).

albright
Advocate

Joined: 16 Nov 2003
Posts: 2109
Location: Near Toronto

Posted: Mon Dec 19, 2011 2:00 pm

I think kde is great - except for the kdepim "suite", which is
at the moment just garbage.

But I think kde developers will eventually fix kmail, kontact, etc

For now I can use thunderbird

I also want to say that for me, finally, strigi search is working OK
and is very useful
_________________
.... there is nothing - absolutely nothing - half so much worth
doing as simply messing about with Linux ...
(apologies to Kenneth Graeme)

radio_flyer
Apprentice

Joined: 04 Nov 2004
Posts: 167
Location: Northern California

Posted: Mon Dec 19, 2011 6:29 pm

Dr.Willy wrote:
radio_flyer wrote:
What I don't understand are the comments in this forum about KDE selling out to the 'social networking' crowd. Near as I can tell, this KDE semantic desktop stuff is more along the lines of the thoughts expressed in this article: http://radar.oreilly.com/2011/07/why-files-need-to-die.html

Quote:
In the world of linked data and semantically indexed information, saving or losing data is not something we'll have to worry about. The stream is saved. Think about it: You'd never have to organize your emails or project plans because everything would be there, as connected as the thoughts in your head. Collaborating and sharing would simply mean giving other people access to read from or contribute to part of your stream.

We already see a glimpse of this world when we look at Facebook. It's no wonder that it's so successful; it lets us deal with people, events, messages and photos — the real fabric of our everyday lives — not artificial constructs like files, folders and programs

Summary: "Wouldn't it be great if computers were as confused as our brains?"


OK, I stand corrected :o

I do agree in one sense with the quote though--I've always thought the biggest mistake on the Internet was not mandating creation & modtime attributes in the HTML tag. That would have made web searches 1000x more fruitful.

Back to Nepomuk/Strigi/Akonadi though. Thanks for the links baaann, that helps explain things. I guess I thought it was going to be some sort of uber-search tool. That I could use. If its primary use is to re-organize my desktop in the form of a Twitter feed, I'm not interested. Sorry for re-opening this.

iandoug
Apprentice

Joined: 11 Feb 2005
Posts: 294
Location: Cape Town, South Africa

Posted: Tue Dec 20, 2011 11:13 am

Tatsh wrote:
Would I like to search my PDF contents easily? Yes. Does it need to be in KDE? No! How about someone make pdfgrep (and a library too)? Wouldn't that be easier to use on an automated basis anyway?


See Recoll in bgo-overlay

http://www.lesbonscomptes.com/recoll/

I've played with it, takes a while to index large drives....

cheers, Ian
_________________
Asus M3A78 64, X2 6000+, PX9800 GT, 4GB Ram | Asus M4A77TD PRO, X2 245, HD4350, 4GB RAM

radio_flyer
Apprentice

Joined: 04 Nov 2004
Posts: 167
Location: Northern California

Posted: Tue Dec 20, 2011 4:43 pm

Here's an example of what I don't understand about nepomuk/strigi: I had a question about GNU Make today. I know I have the 'Managing Projects with GNU Make' pdf e-book on my system. I opened up Dolphin, clicked on the 'Find' binoculars, and entered 'gnu Managing make'. These are the results from nepomuksearch (as best as I can reproduce on these forums):
Code:

autobook_196.html               /home/rdj/public_html/program/autobook-1.4
aw_pgsql_book.pdf               /home/rdj/scratch/oldws/public_html/postgres
CVSQuickReference.pdf           /home/rdj/public_html/program/quickref
index.html                      /home/rdj/public_html
JackAudioRefMan.pdf             /home/rdj/public_html/audio
Linux-Filesystem-Hierarchy.pdf  /home/rdj/scratch/oldws/public_html/program
Program-Library-HOWTO.pdf       /home/rdj/public_html/program
t2-handbook.pdf                 /home/rdj/public_html/program


This is after I've let strigi loose multiple times on my home directory with an ~1G database to show for it.

And here's what I get when I type 'locate Managing':
Quote:

/home/rdj/projects/gumstix/gumstix-oe/org.openembedded.snapshot/packages/samba/files/Managing-Samba.txt
/home/rdj/public_html/program/ManagingProjectsWithGNUMake-3.1.3.pdf


Note that locate's 'updatedb' runs from a cron job early in the morning and doesn't usually take more than a few minutes.
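
For comparison, the whole locate idea fits in a few lines. This is a Python sketch with made-up helper names (`update_db`, `locate`), not the real implementation, which as far as I know keeps a compressed on-disk database rather than a list in memory: one walk records every path, and a query is just a substring match against that list.

```python
import os


def update_db(root):
    """Walk `root` once and return a sorted list of every file path."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            paths.append(os.path.join(dirpath, name))
    return sorted(paths)


def locate(db, pattern):
    """Return every stored path that contains `pattern` as a substring."""
    return [path for path in db if pattern in path]
```

It only knows file names, not contents, but for the question I asked above that was all it took.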

I don't see a problem sending beta code to early adopters for testing and feedback. But alpha code? If the code just plain doesn't work, there won't be much feedback because no one will use it. Either that, or there's something I seriously don't understand about how nepomuk search is supposed to work.

It also seems to me that full-content search and metadata tagging belongs at the filesystem level, preferably using a kernel-loadable optional filesystem. That would allow inotify to work properly, could provide proper journaling for the content index, and would make it easier to respect the Unix file permissions/attributes for privacy purposes. M$ had the right idea with WinFS, but a terrible implementation. Plus, if such a filesystem proved useful (and I think it would with the increasing size of hard disks) it would be so--optionally--across all the Linux desktop environments.

That Recoll link does look interesting.

kimmie
Guru

Joined: 08 Sep 2004
Posts: 531
Location: Australia

Posted: Wed Dec 21, 2011 3:49 am

Most people are capable of managing things like cupboards, filing cabinets and even (shock) some simple hierarchical tree-based categorisation. Somebody teaches us how, or we work it out after a while. Some of us can even do tax returns without it being a major pain in the ass. Part of it is arriving at a workable system, another part is having the time-management skills to use it effectively.

But there are some people who just can't get the hang of it. Their desks tend to be surrounded by huge piles of paper. Things fall out of cupboards when you open them. They can't let go of old stuff or throw things out. They own lots of stuff they don't even know about. These people own computers, and their computers follow the same pattern. Their desktop is covered with a zillion icons. Every now and again they get worried about backing up, so they randomly copy stuff about and end up with multiple duplicates of the same files. They can't really use the organisational tools that OS's give them because they can't organise themselves.

For these people, desktop search is a really cool thing. For the rest of us, directories, filenames and the occasional locate or find work just fine and always will.

But THEN there's a guy I know who had Google Desktop, Copernic Desktop Search AND Windows Search all running on the same computer at the same time.

"You can't do that, you're shooting yourself in the foot. These programs are all competing with each other trying to index your whole computer, they're slowing it to a crawl."
"But I need them to find things."
"But why do you have three desktop search engines?"
"Three? Which three?"
"Google Desktop, Copernic and Windows Search."
"I don't know where Windows Search came from, I didn't install it."
"It probably came with an automatic update, but haven't you noticed it in the UI?"
"No."
"Ok, so why Google and Copernic?"
"They have different features, I use all of them."
"Which features?"
... turns out Copernic can find old emails in some format Google desktop has no clue about...
"But why do you need to find 7 year old emails?"
"I might need them."
"How often do you need them? Why do you have to find them on-line instantly?"
"I don't know."
"What about not having to wait 10 minutes every time you load Firefox?"
"... silence..."

This person can take a brand new install and destroy it within a week or two. Never uninstalls a plugin or a program. Never deletes a duplicate shortcut. Every message arrives 5 times in their email inbox because of the maze of redirections they've configured. Desktop search is essential for this person, because without it, he can't find things he did two days ago. Even with desktop search it's a struggle, because he can't organise desktop search itself! How amazing is that? It's a miracle meta-fail. What he really needs is a few decades of good psychiatry, but I suspect it's way too late for that.

But this gets me to thinking; I propose an entirely new multi-media, multi-sensory model, illustrated below:

(user copies a directory from desktop to his backup drive)
"Why are you doing that, Dave?"
"Siri, I'm backing it up."
"You did that yesterday, Dave. Only 1 file has changed since then."
"Siri, just copy it, I don't care."
"But it will take me three hours, Dave, and everything else I do will slow down in the meantime. Also, you did it last week as well."
(silence)
"Ok, Dave, I'll copy it."
(silence, Dave opens e-mail client, and then repeatedly clicks on unresponsive window)
"Dave, you're tickling me, please wait while I load your inbox!"
"Siri, I just want to check my mail."
"I know Dave, but your inbox has 627,546 items. Maybe I could archive some of them for you?"
"Siri, last time you did that you lost a whole lot of them."
"I'm sorry about that Dave, but I couldn't help it, you forced me to shut down while I was still busy."
"Siri, just show me my email!"
"Ok, Dave, your inbox will be loaded in about 7 minutes."
(silence)
(Dave tries to open porn site)
"Dave, your mother called again last night, didn't she?"
Tatsh
Tux's lil' helper


Joined: 22 Jul 2007
Posts: 102

PostPosted: Wed Dec 21, 2011 1:32 pm    Post subject: Reply with quote

iandoug wrote:
Tatsh wrote:
Would I like to search my PDF contents easily? Yes. Does it need to be in KDE? No! How about someone make pdfgrep (and a library too)? Wouldn't that be easier to use on an automated basis anyway?


See Recoll in bgo-overlay

http://www.lesbonscomptes.com/recoll/

I've played with it, takes a while to index large drives....

cheers, Ian


Seriously, this is amazing. I indexed about 6600 PDFs/Word/CHM/etc in about 30 minutes and the search comes up in like a split second. This is invaluable!

Code:

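# Run as root. Accept the testing (~amd64) keyword for recoll.
echo 'app-misc/recoll ~amd64' >> /etc/portage/package.keywords
# Enable the document-format filters and inotify monitoring; leave fam off.
echo 'app-misc/recoll chm djvu dvi exif inotify msdoc msppt msxls pdf ps qt4 rtf session spell wordperfect xml -fam' >> /etc/portage/package.use
emerge recoll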


Seriously, nepomuk has NOTHING on recoll.
albright
Advocate


Joined: 16 Nov 2003
Posts: 2109
Location: Near Toronto

PostPosted: Wed Dec 21, 2011 4:30 pm    Post subject: Reply with quote

Not having much luck here; the indexer starts asking
questions (luckily I started in a terminal or wouldn't
have seen it), such as:
Quote:
replace /tmp/rclsoff_tmp19926/rclsofftmp/mimetype? [y]es, [n]o, [A]ll, [N]one, [r]ename:


and then hangs on a certain file (an rtf for what that's worth).

If I try to "stop indexing" nothing happens ... program appears to hang ...

Seems to be roughly as reliable and effective as strigi/nepomuk ... :cry:
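
The prompt quoted above resembles the overwrite question that unzip asks, so one possible workaround is to run the indexer with nothing on stdin and under a time limit, so that a helper asking questions cannot wait on the keyboard forever. A rough sketch only: run_unattended is a made-up name, timeout is the coreutils one, and whether this gets past the RTF hang is untested.

```shell
# run_unattended SECONDS COMMAND [ARGS...]
# Run COMMAND with stdin taken from /dev/null and kill it after SECONDS.
# coreutils timeout exits with status 124 when the limit is reached.
run_unattended() {
    limit="$1"
    shift
    timeout "$limit" "$@" < /dev/null
}

# Hypothetical use, assuming recollindex is the indexer binary:
#   run_unattended 3600 recollindex
```

An exit status of 124 would at least tell you the indexer was killed by the limit rather than finishing on its own.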
_________________
.... there is nothing - absolutely nothing - half so much worth
doing as simply messing about with Linux ...
(apologies to Kenneth Graeme)
iandoug
Apprentice


Joined: 11 Feb 2005
Posts: 294
Location: Cape Town, South Africa

PostPosted: Thu Dec 22, 2011 12:56 pm    Post subject: Reply with quote

albright wrote:
Not having much luck here; the indexer starts asking
questions (luckily I started in a terminal or wouldn't
have seen it), such as:
Quote:
replace /tmp/rclsoff_tmp19926/rclsofftmp/mimetype? [y]es, [n]o, [A]ll, [N]one, [r]ename:


and then hangs on a certain file (an rtf for what that's worth).



I seem to remember there were some issues regarding rtf files ... needed to install some helper IIRC.

Unfortunately the upgrade to KDE 4.7 and the associated KMail buggered my system (been fighting with this since Saturday, and have now tried to revert to the old KMail) so I can't dig up the relevant messages between myself and the dev, who is very helpful.

Will report back later.
_________________
Asus M3A78 64, X2 6000+, PX9800 GT, 4GB Ram | Asus M4A77TD PRO, X2 245, HD4350, 4GB RAM
gerard82
Advocate


Joined: 04 Jan 2004
Posts: 2229
Location: Netherlands

PostPosted: Thu Dec 22, 2011 4:05 pm    Post subject: Reply with quote

Yesterday I installed recoll-1.16.2 on a separate instance of Gentoo Linux.
It ran fine.
So today I did the same on my regular desktop.
There's only one word for it: Formidable!
Works nicely with KDE.
Why can't KDE come up with something similar?
Not enough bling I guess.
Gerard.
_________________
To install Gentoo I use sysrescuecd. Based on Gentoo, has firefox to browse Gentoo docs and mc to browse (and edit) files.
The same disk can be used for 32 and 64 bit installs.
You can follow the Handbook verbatim.
http://www.sysresccd.org/Download
Back to top
View user's profile Send private message
iandoug
Apprentice


Joined: 11 Feb 2005
Posts: 294
Location: Cape Town, South Africa

PostPosted: Thu Dec 22, 2011 4:18 pm    Post subject: Reply with quote

iandoug wrote:
albright wrote:
Not having much luck here; the indexer starts asking
questions (luckily I started in a terminal or wouldn't
have seen it), such as:
Quote:
replace /tmp/rclsoff_tmp19926/rclsofftmp/mimetype? [y]es, [n]o, [A]ll, [N]one, [r]ename:


and then hangs on a certain file (an rtf for what that's worth).



I seem to remember there were some issues regarding rtf files ... needed to install some helper IIRC.



Having successfully (so far, so good) downgraded back to KMail 1, I looked for the mails and they were about .rar not .rtf files, sorry.

However my memory banks remind me that there are several 'versions' of RTF files, and resulting incompatibility issues:
http://en.wikipedia.org/wiki/Rich_Text_Format

I suggest you take the matter up with recoll's author, he's very helpful.

If anyone can find out why recoll is not in gentoo, that would also be cool :-)

thanks, Ian
_________________
Asus M3A78 64, X2 6000+, PX9800 GT, 4GB Ram | Asus M4A77TD PRO, X2 245, HD4350, 4GB RAM
All times are GMT