Topic: File encoding and storage

theboa
« on: June 06, 2012, 03:59:50 PM »
I have a project containing multiple .php files in cp1251 encoding. HE opens these files as utf-8 by default, but it allows changing the encoding (View >> Encoding) and remembers the choice afterwards.

The questions are:
1. How does HE determine the encoding of a file? (My PHP files do not contain any <meta> tags.)
2. Where does HE store the encoding information for a particular file? I would like to copy this information to another PC if possible.

Thank you!

alex (Developer, HippoEDIT)
« Reply #1 on: June 06, 2012, 07:45:39 PM »
Quote
1. How does HE determine the encoding of a file? (My PHP files do not contain any <meta> tags.)
Some background: see "How encoding detection works in HippoEDIT" and "How to set default encoding for specific syntax".

In the particular case of PHP, utf-8 comes from the XML definition (xml_spec.xml), where utf-8 is set as the default encoding.
HTML is inherited from XML, and PHP files are treated as HTML files. The PHP code itself is embedded within <?php ?> inside the HTML.
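
To make the inheritance idea concrete, here is a minimal sketch of how a default encoding could be looked up through a chain of syntax definitions. It is not HippoEDIT's code: SyntaxSpec, SyntaxTable, resolveDefaultEncoding and exampleTable are hypothetical names, and modelling PHP as a separate syntax that inherits from HTML is an assumption made only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <optional>
#include <string>

// Hypothetical model of a syntax definition: an optional default encoding
// and the name of the parent syntax it inherits from (empty if none).
struct SyntaxSpec {
    std::optional<std::string> defaultEncoding;
    std::string parent;
};

using SyntaxTable = std::map<std::string, SyntaxSpec>;

// Walk up the inheritance chain until some syntax declares a default encoding.
std::optional<std::string> resolveDefaultEncoding(const SyntaxTable& table,
                                                  std::string name) {
    // Bound the walk by the table size so a cyclic definition cannot loop forever.
    for (std::size_t steps = 0; steps <= table.size(); ++steps) {
        auto it = table.find(name);
        if (it == table.end()) return std::nullopt;           // unknown syntax
        if (it->second.defaultEncoding) return it->second.defaultEncoding;
        if (it->second.parent.empty()) return std::nullopt;   // root, no default
        name = it->second.parent;
    }
    return std::nullopt;
}

// Assumed example chain: php -> html -> xml, with only xml declaring utf-8.
SyntaxTable exampleTable() {
    return {
        {"xml",  {std::string("utf-8"), ""}},
        {"html", {std::nullopt, "xml"}},
        {"php",  {std::nullopt, "html"}},
    };
}
```

With this table, looking up "php" walks through "html" to "xml" and picks up utf-8 there, which matches the behaviour described above. Setting a default on a more specific syntax would stop the walk earlier.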

Quote
2. Where does HE store the encoding information for a particular file? I would like to copy this information to another PC if possible.
This information is stored in %TEMP%\hefilepref.tmp and is PC dependent (it contains absolute paths). Because it is binary, it also depends (sometimes) on the HE version and on the architecture (x86/x64). So copying it is not a good idea...
When designing this, I decided that file-specific data should be temporary: not human readable, but fast to restore, and not meant to be copied.
Navigation data such as bookmarks, cursor position and folding state, on the other hand, should be persistent, kept alongside the file as a satellite file, and can be copied together with it.
Another reason why I decided not to put this data in the satellite file (the first was performance) is that I want to limit the number of such heinf files, to avoid an explosion of service files in working directories.
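
To illustrate why a raw binary dump is tied to x86/x64, here is a small sketch. RawRecord and PortableRecord are hypothetical and are not the real layout of hefilepref.tmp. The point is only that a struct holding a pointer changes size with the pointer width, while a struct built from fixed-width fields does not.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical record written to disk as-is. Its size depends on the
// build target because of the pointer member and the padding around it.
struct RawRecord {
    unsigned flags : 8;
    void*    language;   // typically 4 bytes on x86, 8 bytes on x64
    unsigned codePage;
};

// Hypothetical portable variant: fixed-width fields only, and the pointer
// replaced by an identifier that has the same meaning on any machine.
struct PortableRecord {
    std::uint32_t flags;
    std::uint32_t languageId;
    std::uint32_t codePage;
};

constexpr std::size_t rawSize      = sizeof(RawRecord);      // varies per target
constexpr std::size_t portableSize = sizeof(PortableRecord); // 3 x 4 bytes
```

A file of RawRecord entries written by a 32-bit build would be misread by a 64-bit build, and the stored pointer value would be meaningless in either case. The absolute paths mentioned above are a separate problem that a portable layout alone would not solve.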

Later I introduced a directory-level information file and a global one, which will probably solve the problem of one heinf satellite per file, but by then the decision to use the tmp file had already been made.
The best option would probably be to use an alternate stream for that (Tools->General->Editor->Information Storage), but this is limited to NTFS5 only, and from what I have read, alternate streams may be dropped altogether in the new file system that comes with, or after, Windows 8.

What else is stored per file:

int   m_eCRLFStyle         : 5;
UINT  m_eWriteBOM          : 2;
UINT  m_bAutoReload        : 2;
UINT  m_bReadOnly          : 1;
UINT  m_bAutoSave          : 2;
UINT  m_bRuler             : 2;
UINT  m_bMargin            : 2;
UINT  m_bChangedLines      : 2;
UINT  m_bLineNumbers       : 2;
UINT  m_bOutlining         : 2;
UINT  m_bPageWidth         : 2;
UINT  m_bIndentGuides      : 2;
UINT  m_bScopeSeparator    : 2;
UINT  m_eWordWrap          : 3;
UINT  m_bVirtSpaces        : 2;
UINT  m_bInactiveCode      : 2;
UINT  m_bWhiteSpace        : 2;
UINT  m_bTrailWhiteSpace   : 2;
UINT  m_bCurrentLine       : 2;
UINT  m_bNavigationBar     : 2;
UINT  m_bHierarchyBar      : 2;
UINT  m_bOverviewBar       : 2;
UINT  m_bSearchBar         : 2;
UINT  m_nIndent            : 8;
UINT  m_bSearchCaseSensit  : 2;
UINT  m_bSearchIncremental : 2;
UINT  m_bSearchWholeWord   : 2;
UINT  m_bScopeNesting      : 2;
UINT  m_bAlternatingMode   : 2;
UINT  m_bSearchRegexp      : 2;

CPLanguage*  m_pLanguage;
UINT         m_nCodePage;
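
Most of the boolean-looking fields above are 2 bits wide rather than 1. The thread does not say why. One plausible reading, and it is only an assumption, is that each setting is a tri-state: use the global default, force off, or force on. A minimal sketch of that idea follows, with hypothetical names (TriState, FilePrefs, effective, examplePrefs) that are not taken from HippoEDIT:

```cpp
#include <cassert>

// Assumed meaning of a 2-bit per-file setting; not documented in this thread.
enum TriState : unsigned { UseDefault = 0, ForceOff = 1, ForceOn = 2 };

// A few bit-fields in the style of the listing above (13 bits in total).
struct FilePrefs {
    unsigned lineNumbers : 2;  // holds a TriState value
    unsigned wordWrap    : 3;  // small enumeration, values 0..7
    unsigned indent      : 8;  // values 0..255
};

// Combine a stored per-file value with the global default.
bool effective(unsigned stored, bool globalDefault) {
    if (stored == ForceOn)  return true;
    if (stored == ForceOff) return false;
    return globalDefault;      // UseDefault, or any unused bit pattern
}

// Example: line numbers forced on for this file, indent of 4.
FilePrefs examplePrefs() {
    FilePrefs p{};             // all fields zero, so every flag is UseDefault
    p.lineNumbers = ForceOn;
    p.indent = 4;
    return p;
}
```

Under this reading, a file that was never customised stores zeros everywhere and simply follows the global options, while a per-file override survives later changes to those options.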

theboa
« Reply #2 on: June 06, 2012, 11:37:04 PM »
Thank you Alex! That was an exhaustive explanation!