Regular Expression: greedy option

Started by Stefan, November 30, 2008, 11:12:01 PM

Stefan

Hi alex,

RegEx search in HippoEDIT is lazy (non-greedy).

How can I search greedily?

For example, I have this text:
Quote
start
HippoEDIT is great!
HippoEDIT is great!
HippoEDIT is great!
HippoEDIT is great!


Test 1:

If I am on the line 'start'
and search for 'eat!'
with .+eat (RegEx and Ext. Selection enabled),
I get only this selected:
Quote
start
HippoEDIT is great!

HippoEDIT is great!
HippoEDIT is great!
HippoEDIT is great!
This was lazy.


Test 2:

What should I do to search greedily and get:
Quote
start
HippoEDIT is great!
HippoEDIT is great!
HippoEDIT is great!
HippoEDIT is great!


-----

I think

.*?eat!

should match lazily, like:

Quote
start
HippoEDIT is great!

HippoEDIT is great!
HippoEDIT is great!
HippoEDIT is great!



and

.*eat!

should match greedily, like:

Quote
start
HippoEDIT is great!
HippoEDIT is great!
HippoEDIT is great!
HippoEDIT is great!


---


What do you think?

May I suggest making the RegEx "standard", i.e. greedy by default,
and allowing a '?' sign to switch to non-greedy?
Stefan, HippoEDIT beta tester 
HippoEDIT - the editor programmers wants to code thyself when they are dreaming.        -Don't just edit. HippoEDIT!-
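
For comparison, the behaviour Stefan asks for can be reproduced in Python's `re` module. This is only a sketch of how a conventional engine treats the two patterns, not a description of HippoEDIT's search. It assumes the dot is allowed to match line breaks (`re.DOTALL` in Python), which is what lets the match run past the first line.

```python
import re

# The sample text from the post: "start" followed by four identical lines.
text = "start\n" + "\n".join(["HippoEDIT is great!"] * 4)

# re.DOTALL lets "." match "\n", so a match can span several lines.
greedy = re.search(r".*eat!", text, re.DOTALL)
lazy = re.search(r".*?eat!", text, re.DOTALL)

# Greedy runs to the last "eat!", lazy stops at the first one.
print(repr(greedy.group()))
print(repr(lazy.group()))
```

Here the greedy pattern selects all five lines, while the lazy one stops at the end of the first "HippoEDIT is great!" line.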

alex

Hello Stefan,

... unfortunately I think this is also a bug. Greedy search should work too, but with the current search algorithm it is not possible. I think I need to redo regexp searching, and this is not very easy...
HippoEDIT stores text as lines (without line breaks). Before a search, I analyze the regular expression to estimate how many lines it can span, and then feed multiline blocks of that size to the search engine one by one. And of course it is not possible to predict how many lines are needed for an expression like ".+eat"...

I think I need to switch to stream search for regexp. It would decrease the performance of regular expression search, but would be less buggy and more flexible. When, I don't know yet. This would be a big change, so I prefer to move this topic to 1.5 and release 1.4 with quick fixes (to at least avoid crashes in multiline searches), because otherwise I could delay 1.4 and introduce new bugs.
HippoEDIT team
[url="http://www.hippoedit.com/"]http://www.hippoedit.com/[/url]
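
The "stream search" alex mentions could look roughly like the sketch below: join the stored lines into one buffer, run the pattern once over it, and translate the match offsets back to line and column positions. The function name `stream_search`, the zero-based `(line, column)` return format and the use of Python's `re` module are assumptions made for illustration. HippoEDIT's actual engine and data structures are not shown in this thread.

```python
import re
from bisect import bisect_right

def stream_search(lines, pattern, flags=0):
    """Search all lines as one buffer; return ((line, col), (line, col)) or None."""
    buffer = "\n".join(lines)
    # starts[i] is the buffer offset at which lines[i] begins.
    starts = [0]
    for line in lines[:-1]:
        starts.append(starts[-1] + len(line) + 1)  # +1 for the joining "\n"

    def to_line_col(offset):
        # Find the last line whose start offset is <= offset.
        line_no = bisect_right(starts, offset) - 1
        return line_no, offset - starts[line_no]

    match = re.search(pattern, buffer, flags)
    if match is None:
        return None
    return to_line_col(match.start()), to_line_col(match.end())

lines = ["start"] + ["HippoEDIT is great!"] * 4
# With "." allowed to match line breaks, the greedy match spans all lines.
print(stream_search(lines, r".+eat", re.DOTALL))
```

Joining the whole document for every search is the performance cost alex refers to. In exchange, the engine no longer has to guess in advance how many lines a pattern may need.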

Stefan

Another issue ==> RegEx Find: ^$ doesn't work.
HE says "Cannot find string '^$'".
It is expected to find an empty line.
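
As a point of reference, this is how `^$` behaves in Python's `re` module when the anchors are applied per line (`re.MULTILINE`). It is a sketch of the expected result only. How HippoEDIT applies line anchors internally is not described here.

```python
import re

text = "first line\n\nthird line"

# re.MULTILINE makes ^ and $ match at every line boundary, not only at
# the start and end of the whole string.
match = re.search(r"^$", text, re.MULTILINE)

# Zero-based index of the line on which the empty match was found.
empty_line_no = text.count("\n", 0, match.start())
print(empty_line_no)
```

Without the per-line flag the same pattern finds nothing in this text, because `^` then only matches at the very start of the buffer.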

alex


Stefan

Fixed: RegEx ^$ finds empty lines now. Thanks.
ToDo: correct greedy behaviour

Stefan

I have rethought my explanation:
the problem is not a greedy issue as such, but an issue with "dot matches new line" (?m).

Because I see now that the greedy and lazy options work in HippoEDIT too:


I have this text:
HippoEDIT is great!HippoEDIT is great!
HippoEDIT is great!HippoEDIT is great!
---------

I search for RegEx .*eat and get

HippoEDIT is great!HippoEDIT is great!
HippoEDIT is great!HippoEDIT is great!

So RegEx greedy works.
---------

I search for RegEx .*?eat and get
HippoEDIT is great!HippoEDIT is great!
HippoEDIT is great!HippoEDIT is great!

So RegEx lazy works too.
-----------
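
The single-line behaviour described above matches what a conventional engine does when the dot is not allowed to cross line breaks. A small sketch in Python's `re` module (used here purely for comparison, since it is not HippoEDIT's engine):

```python
import re

line = "HippoEDIT is great!HippoEDIT is great!"
text = line + "\n" + line

# Without re.DOTALL, "." stops at "\n", so both matches stay on line one.
greedy = re.search(r".*eat", text)
lazy = re.search(r".*?eat", text)

print(repr(greedy.group()))  # runs to the last "eat" on the first line
print(repr(lazy.group()))    # stops at the first "eat" on the first line
```

In Python the "dot matches new line" switch is spelled `re.DOTALL` or inline `(?s)`, while `(?m)` changes only how `^` and `$` behave. Other regex flavours may name these switches differently.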

What I meant above was: search RegEx (?m).*eat across multiple lines, to get
HippoEDIT is great!HippoEDIT is great!
HippoEDIT is great!HippoEDIT is great!






EDIT:


^.*(\r|\n)*.*eat.*$
   .*(\r|\n)*.*eat.*

match on two lines, not on three or four. Perhaps I'll get it if I try harder?


HippoEDIT is great!HippoEDIT is great!
HippoEDIT is great!HippoEDIT is great!

HippoEDIT is great!HippoEDIT is great!
HippoEDIT is great!HippoEDIT is great!

HippoEDIT is great!HippoEDIT is great!
HippoEDIT is great!HippoEDIT is great!
HippoEDIT is great!HippoEDIT is great!
HippoEDIT is great!HippoEDIT is great!
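
A two-line limit for `.*(\r|\n)*.*eat.*` is also what a conventional engine gives, independent of any editor-specific restriction: each `.*` stays within one line, and `(\r|\n)*` only consumes the line breaks between two adjacent pieces of text. The sketch below shows this in Python's `re` module, together with one possible way to span an arbitrary number of lines. The second pattern is an illustration, not something suggested in the thread, and whether it would have worked in HippoEDIT at the time is not known.

```python
import re

line = "HippoEDIT is great!HippoEDIT is great!"
text = "\n".join([line] * 4)

# Each ".*" is confined to one line, so this spans at most two text lines.
two_line = re.search(r".*(\r|\n)*.*eat.*", text)

# Repeating "a whole line plus its line break" lets the match grow freely.
many_line = re.search(r"(?:.*\n)*.*eat.*", text)

print(two_line.group().count("\n") + 1)   # number of lines matched
print(many_line.group().count("\n") + 1)  # number of lines matched
```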

alex

Hi Stefan,

yes, the problem is once more the line-based search in the current implementation. What I do is check how many lines you want by counting \r\n in the search string, and then prepare a block for the search by combining N lines.
I see that this logic is incorrect and incomplete, and I will rework it in 1.50. Dirty work but necessary ;)
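
The line-based approach alex describes might be sketched as follows. Everything here is a guess at the general idea based only on this post: the function name `line_block_search`, the rule "one extra line per `\n` written in the pattern", and the use of Python's `re` module are all assumptions, not HippoEDIT's real code.

```python
import re

def line_block_search(lines, pattern, flags=0):
    """Hypothetical line-based search: block size comes from the pattern text."""
    # Assumption: one extra line per literal "\n" escape written in the pattern.
    block_size = pattern.count(r"\n") + 1
    for first in range(len(lines)):
        # Only block_size lines are ever visible to the regex engine.
        block = "\n".join(lines[first:first + block_size])
        match = re.search(pattern, block, flags)
        if match:
            return first, match.group()
    return None

lines = ["start"] + ["HippoEDIT is great!"] * 4
# No "\n" in the pattern: one-line blocks, so the match cannot grow.
print(line_block_search(lines, r".+eat", re.DOTALL))
# One "\n" in the pattern: two-line blocks, so at most two lines match.
print(line_block_search(lines, r"start\n.+eat", re.DOTALL))
```

Under this assumed rule, a pattern such as `.*(\r|\n)*.*eat.*` would be given two-line blocks, which would be consistent with the two-line matches Stefan reports, however greedy the quantifiers are.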